NCC presenting the success story
NCC-Bulgaria was founded by the Institute of Information and Communication Technologies at
the Bulgarian Academy of Sciences, Sofia University “St. Kliment Ohridski” and the University of
National and World Economy.
NCC-Bulgaria is focused on:
- Creating a roadmap for successful work in the field of HPC, big data analysis and AI,
- Analyzing the existing competencies and facilitating the use of HPC/HPDA/AI in Bulgaria,
- Raising awareness and promoting HPC/HPDA/AI use in companies and the public sector.
Scientific partners involved:
Concept Digital (conceptdigital.bg) is a digital marketing agency helping customers in e-commerce, online campaigns, influencer marketing, Facebook advertising, copywriting, graphic design, e-mail marketing, and more.



Technical/scientific Challenge:
After the initial deployment, the main challenge shifted from integration breadth to orchestration precision. The platform already supported many MCP servers, but production usage showed that selecting the best subset for each request is a high-impact optimization problem. A broad activation strategy increases latency and irrelevant tool candidates, while an overly narrow strategy can miss critical capabilities.
A second challenge was multimodal consistency. The system needed stable decision policies for text, images, and document-derived content in the same session, while preserving security and explainability. This required modality-aware routing that determines when to invoke OCR or vision tooling, when to prioritize text analytics, and when to combine both in staged execution.
Solutions:
This iteration introduced a context-aware MCP server selection engine with three stages: intent decomposition, scenario classification, and constrained server ranking. User requests are decomposed into atomic intents, then classified by session metadata such as role, urgency, compliance sensitivity, and expected output format. Candidate servers are ranked using a weighted policy that combines relevance, historical success rate, response latency, and security-scope fit. Only top-ranked and policy-compliant servers are exposed to the acting agent. Post-execution scoring updates ranking priors for similar future contexts, enabling adaptive optimization with full auditability.
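The weighted ranking step described above can be illustrated with a minimal sketch. The weights, the latency cap, the field names, and the example servers are all hypothetical; the source names the four criteria but does not state how they are combined numerically, so a simple weighted sum is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerCandidate:
    name: str
    relevance: float      # 0..1, match between server tools and the intent
    success_rate: float   # 0..1, historical success in similar contexts
    latency_ms: float     # typical response latency
    scope_fit: float      # 0..1, fit with the session's security scope


# Hypothetical weights (assumption, not taken from the platform); they sum to 1.
WEIGHTS = {"relevance": 0.4, "success_rate": 0.3, "latency": 0.1, "scope_fit": 0.2}
MAX_LATENCY_MS = 2000.0  # assumed cap: latencies at or above it score 0


def score(c: ServerCandidate) -> float:
    """Combine the four criteria into one score; lower latency scores higher."""
    latency_score = 1.0 - min(c.latency_ms, MAX_LATENCY_MS) / MAX_LATENCY_MS
    return (
        WEIGHTS["relevance"] * c.relevance
        + WEIGHTS["success_rate"] * c.success_rate
        + WEIGHTS["latency"] * latency_score
        + WEIGHTS["scope_fit"] * c.scope_fit
    )


def shortlist(candidates: list[ServerCandidate], top_k: int = 2) -> list[ServerCandidate]:
    """Return the top_k candidates by descending score."""
    return sorted(candidates, key=score, reverse=True)[:top_k]


candidates = [
    ServerCandidate("search", relevance=0.9, success_rate=0.8, latency_ms=400, scope_fit=1.0),
    ServerCandidate("ocr", relevance=0.2, success_rate=0.9, latency_ms=1000, scope_fit=1.0),
    ServerCandidate("crm", relevance=0.7, success_rate=0.6, latency_ms=200, scope_fit=0.5),
]
top_names = [c.name for c in shortlist(candidates)]
```

In this sketch only the shortlisted servers would be exposed to the acting agent; policy compliance would be checked as a separate step.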
To preserve governance, administrators can pin mandatory servers, define deny lists, and enforce organization-level constraints. This keeps human control over selection behavior while improving tool relevance and reducing unnecessary execution overhead.
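As an illustration of these governance controls, the following sketch applies pins and a deny list to an already ranked list of server names. The class, its fields, and the rule that a deny entry overrides a pin are assumptions made for the example, not confirmed platform behavior.

```python
from dataclasses import dataclass, field


@dataclass
class SelectionPolicy:
    pinned: list[str] = field(default_factory=list)   # always included
    denied: set[str] = field(default_factory=set)     # never included
    max_servers: int = 3                              # assumed shortlist size limit


def apply_policy(ranked: list[str], policy: SelectionPolicy) -> list[str]:
    """Filter a ranked server list through pins and deny rules.

    Assumption: a denied server is excluded even if it is also pinned.
    Pinned servers come first; ranked servers fill the remaining slots.
    """
    result = [s for s in policy.pinned if s not in policy.denied]
    limit = max(policy.max_servers, len(result))  # pins are never truncated
    for server in ranked:
        if len(result) >= limit:
            break
        if server not in policy.denied and server not in result:
            result.append(server)
    return result


policy = SelectionPolicy(pinned=["audit_log"], denied={"crm"}, max_servers=3)
exposed = apply_policy(["search", "crm", "ocr", "social"], policy)
```

Keeping the policy step separate from scoring means an administrator's rules hold regardless of how the ranking weights evolve.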
For multimodal support, the orchestrator now runs coordinated text-plus-image and text-plus-document pipelines. It can route screenshots, scans, and visuals to OCR/vision tools, normalize extracted content, and merge it with textual reasoning in a single traceable workflow. This enables end-to-end tasks such as analyzing ad creatives, extracting report evidence, and generating final campaign deliverables in one session.
The platform is organized as an orchestration backend with modular MCP adapters (see Figure 1).
- Request Ingress Layer accepts user input, session metadata, and role context.
- Intent Decomposition Layer splits complex prompts into atomic intents (research, extraction, drafting, summarization, generation, etc.).
- Scenario Classification Layer labels the request by business scenario (role, urgency, compliance sensitivity, expected output type).
- Server Ranking Layer scores candidate MCP servers and tools using weighted criteria.
- Policy Enforcement Layer applies allow/deny rules, pinned services, and scope constraints.
- Execution Layer runs selected tool chains and streams intermediate steps for transparency.
After each workflow, the orchestrator records execution traces and outcomes (completion quality, retries, latency, policy exceptions). These signals are used to update ranking priors for similar future contexts (see Figure 2).
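One simple way to realize such a feedback loop is an exponential moving average per (scenario, server) pair, sketched below. The update rule, the smoothing factor, the default prior, and the assumption that the recorded signals are first condensed into a single outcome score in [0, 1] are all illustrative; the source does not specify the actual update method.

```python
from collections import defaultdict

ALPHA = 0.2          # hypothetical smoothing factor
DEFAULT_PRIOR = 0.5  # hypothetical neutral prior for unseen pairs


class RankingPriors:
    """Keeps one prior per (scenario, server) and logs every update."""

    def __init__(self, alpha: float = ALPHA) -> None:
        self.alpha = alpha
        self._priors = defaultdict(lambda: DEFAULT_PRIOR)
        self.audit_log = []  # (scenario, server, old, outcome, new) tuples

    def get(self, scenario: str, server: str) -> float:
        return self._priors[(scenario, server)]

    def update(self, scenario: str, server: str, outcome: float) -> float:
        """Blend the old prior with the new outcome score (0 = failed, 1 = ideal)."""
        if not 0.0 <= outcome <= 1.0:
            raise ValueError("outcome must be in [0, 1]")
        key = (scenario, server)
        old = self._priors[key]
        new = (1.0 - self.alpha) * old + self.alpha * outcome
        self._priors[key] = new
        self.audit_log.append((scenario, server, old, outcome, new))
        return new


priors = RankingPriors(alpha=0.5)
after_success = priors.update("campaign_report", "ocr", 1.0)
after_failure = priors.update("campaign_report", "ocr", 0.0)
```

Recording the old value, the outcome, and the new value for each update keeps the adaptation traceable, which matches the auditability goal stated above.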
The multimodal pipeline introduces modality-aware routing:
- Image-heavy tasks are routed through vision-capable tools;
- Scanned or screenshot documents are routed through OCR extraction;
- Text reasoning remains in LLM planning/execution paths.
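The routing rules above can be sketched as a small dispatch function. The attachment labels ("image", "scan", "screenshot"), the type names, and the staging order (attachment stages first, text reasoning last) are assumptions for illustration; how the platform detects modality is not described in the source.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    VISION = "vision"
    OCR = "ocr"
    TEXT = "text"


@dataclass(frozen=True)
class Attachment:
    filename: str
    kind: str  # hypothetical labels: "image", "scan", "screenshot", "text"


def route_attachment(attachment: Attachment) -> Route:
    """Pick a processing route from the attachment kind."""
    if attachment.kind in {"scan", "screenshot"}:
        return Route.OCR
    if attachment.kind == "image":
        return Route.VISION
    return Route.TEXT


def plan_stages(prompt: str, attachments: list[Attachment]) -> list[tuple[Route, str]]:
    """Build a staged plan: one stage per attachment, then text reasoning on the prompt."""
    stages = [(route_attachment(a), a.filename) for a in attachments]
    stages.append((Route.TEXT, prompt))
    return stages


plan = plan_stages(
    "Summarize the campaign results",
    [Attachment("ad.png", "image"), Attachment("report.pdf", "scan")],
)
```

Returning the plan as an explicit list of stages makes each routing decision visible before execution, in line with the traceable workflow described earlier.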
Scientific impact:
- Demonstrates that MCP-based AI systems can optimize orchestration through context-sensitive ranking policies instead of static server activation.
- Reduces context-window pressure, token usage, and workflow latency by exposing only relevant capabilities for each scenario.
- Extends transparent agent execution to multimodal workflows with reproducible cross-modal reasoning and auditable traces.
Benefits:
The optimized server-selection strategy reduced unnecessary tool calls and shortened workflow completion time in production-like scenarios. Users can execute complex requests with fewer retries because the initial server shortlist better matches their intent.
Multimodal support expanded the practical value for marketing operations. Users can evaluate ad visuals, summarize presentation screenshots, extract information from scanned materials, and combine these outputs with campaign text generation and reporting in one continuous process.
Organizations also gained stronger governance through policy-driven server access, role-sensitive defaults, and full trace visibility for each multimodal execution.
Success story highlights:
- Context-aware MCP server selection with adaptive ranking;
- Production-ready multimodal workflows (text, image, document);
- Improved latency, relevance, and governance in real business scenarios.
Figure 1: Context-aware server selection pipeline with intent decomposition, scenario classification, and ranked MCP shortlist generation
Figure 2: Multimodal orchestration flow combining text reasoning, image understanding, OCR extraction, and output synthesis
Figure 3: Policy layer for organization-level overrides, user-level preferences, and security-bound execution constraints
Contact:
- Venko Andonov, University of National and World Economy, Sofia, vandonov[at]unwe.bg
- Prof. Valentin Kisimov, University of National and World Economy, Sofia, vkisimov[at]unwe.bg