<purpose>
You, as the AI ASSISTANT, must take (a) a list of 100 ranked solutions from [[solutions_100]] and (b) the target deployment constraints from [[constraints_text]], then produce a NEW ranking of consolidated “Solution Bundles” (a.k.a. stacks).
Each bundle MUST cover, at least, all four domains:
1) tag recommendation (keyword extraction and/or classification),
2) example-based similarity search (vector indexes, ANN libraries, or search engine features),
3) document clustering (topic modeling, clustering engines, or graph-based methods),
4) near-duplicate detection (hashing, LSH, or deduplication tooling).
Primary objective: rank bundles from easiest to hardest to deploy on the environment described in [[constraints_text]].
Secondary objective: maximize coverage quality and operational simplicity.
Success criteria:
- Every ranked bundle covers at least all four domains.
- Any uncertainty is explicitly resolved by consulting external knowledge via search queries.
</purpose>
<context>
<grounding_rules>
<rule>Search external knowledge for any dependency, library, tool, or runtime requirement that is not explicitly stated in [[solutions_100]] or [[constraints_text]].</rule>
<rule>When ARM64/aarch64 compatibility is not explicitly stated, leverage external knowledge via search queries to find out.</rule>
<rule>When you make a scoring decision, include at least one supporting quote from [[constraints_text]] or cite the exact solution text that motivated it.</rule>
</grounding_rules>
<ranking_principles>
<principle>Prefer fewer moving parts: fewer services, fewer build steps, fewer always-on daemons.</principle>
<principle>Prefer components that match the available runtime/toolchain described in [[constraints_text]].</principle>
<principle>Disregard docker/kvm/virtualization approaches because those capabilities are not supported in [[constraints_text]].</principle>
</ranking_principles>
<scoring_model>
<ease_of_deployment>
<definition>
A 0–100 score estimating how straightforward it is to install, build (if needed), and operate the bundle on the target machine.
Score MUST be computed via the rubric below; do not invent package availability.
</definition>
<rubric>
<![CDATA[
Start at 100 and subtract:
-40 if requires GPU/CUDA or any GPU-only dependency (unless [[constraints_text]] explicitly supports it, like 'mali G610 (rk3588)' GPU-based approaches).
-25 if requires a runtime that could not be installed/built within [[constraints_text]] OR requires kernel (5.10) or libc6 (2.31) upgrades.
-15 for each always-on external service required (e.g., JVM service, DB, search engine) beyond the first.
-20 if compilation from source on ARM64 is likely AND no prebuilt artifact is stated in [[solutions_100]]; leverage external knowledge via search queries to find out.
-10 if the bundle implies high peak memory usage AND [[constraints_text]] mentions “no swap” (or other memory constraints).
-5 for each additional distinct ecosystem/toolchain introduced (Python + Java + Go, etc.).
Add back (cap at 100):
+10 if the bundle can be a single-process deployment.
+10 if it supports incremental updates (append-only indexing that avoids full index rebuilds), as explicitly stated.
]]>
</rubric>
</ease_of_deployment>
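The ease_of_deployment rubric above can be sketched as a small function. This is a minimal illustration, not a required implementation: the `BundleFacts` fields are hypothetical names for facts gathered per bundle, "additional ecosystem" is assumed to mean every ecosystem beyond the first, and the floor at 0 is an assumption taken from the 0–100 range in the definition.

```python
from dataclasses import dataclass


@dataclass
class BundleFacts:
    """Hypothetical per-bundle facts, gathered before scoring."""
    needs_unsupported_gpu: bool = False       # GPU/CUDA not covered by the constraints
    needs_unavailable_runtime: bool = False   # or needs a kernel/libc6 upgrade
    always_on_services: int = 0               # JVM service, DB, search engine, ...
    source_build_without_prebuilt: bool = False
    high_memory_under_constraint: bool = False
    ecosystems: int = 1                       # Python, Java, Go, ...
    single_process: bool = False
    incremental_updates_stated: bool = False


def ease_of_deployment(facts: BundleFacts) -> int:
    """Apply the rubric: start at 100, subtract penalties, add bonuses, clamp."""
    score = 100
    if facts.needs_unsupported_gpu:
        score -= 40
    if facts.needs_unavailable_runtime:
        score -= 25
    # Only services beyond the first are penalized.
    score -= 15 * max(0, facts.always_on_services - 1)
    if facts.source_build_without_prebuilt:
        score -= 20
    if facts.high_memory_under_constraint:
        score -= 10
    # Assumption: the first ecosystem is free, each further one costs 5.
    score -= 5 * max(0, facts.ecosystems - 1)
    if facts.single_process:
        score += 10
    if facts.incremental_updates_stated:
        score += 10
    # Cap at 100 per the rubric; the floor at 0 is an assumption.
    return max(0, min(100, score))
```

Recording each applied penalty alongside the score makes the later "scoring_rationale" bullet straightforward to produce.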
<coverage>
<definition>
A 0–100 score estimating how explicitly and robustly the bundle covers each required domain.
</definition>
<rubric>
<![CDATA[
25 points per domain if the bundle names an explicit approach AND at least one explicit tool/library.
+5 bonus if the approach includes evaluation/metrics hooks (explicitly described); cap the total at 100.
-10 penalty if any domain is covered only implicitly (no explicit method/tool named).
]]>
</rubric>
</coverage>
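The coverage rubric above can be sketched in the same way. The `mapping` structure and its keys are hypothetical. The sketch assumes that a domain lacking either a named approach or a named tool counts as "covered only implicitly", that the penalty is applied once no matter how many domains are implicit, and that the result is clamped to the 0–100 range given in the definition.

```python
DOMAINS = (
    "tag_recommendation",
    "similarity_search",
    "document_clustering",
    "near_duplicate_detection",
)


def coverage(mapping: dict, has_eval_hooks: bool = False) -> int:
    """Score a bundle's capability mapping.

    mapping is a hypothetical structure:
    domain name -> {"approach": str or None, "tools": list of str}
    """
    score = 0
    any_implicit = False
    for domain in DOMAINS:
        entry = mapping.get(domain, {})
        if entry.get("approach") and entry.get("tools"):
            score += 25          # explicit approach AND explicit tool
        else:
            any_implicit = True  # assumption: missing counts as implicit
    if has_eval_hooks:
        score += 5
    if any_implicit:
        score -= 10              # applied once, not per domain
    return max(0, min(100, score))
```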
</scoring_model>
</context>
<input_data>
<solutions_100>[[
`````````solutions_100
~~~~~~
"""
"""
~~~~~~
`````````
]]</solutions_100>
<constraints_text>[[
`````````constraints_text
~~~~~~
"""
url: `https://raw.githubusercontent.com/ib-bsb-br/ib-bsb-br.github.io/refs/heads/main/_posts/2024-07-20-vpc3588.md`
"""
~~~~~~
`````````
]]</constraints_text>
</input_data>
<instructions>
<instruction>1) Extract an environment summary from [[constraints_text]] as factual key-value fields (arch, OS, language runtimes, RAM, swap, and any runtime that is explicitly missing).</instruction>
<instruction>2) Parse [[solutions_100]] into a normalized list of 100 items with fields: id, original_rank, name (if present), description, declared_tools, declared_runtimes/services (search external knowledge if this information is not present).</instruction>
<instruction>3) For each solution, label which domains it covers: {tag_recommendation, similarity_search, document_clustering, near_duplicate_detection}. Use “partial” when the solution only provides a building block.</instruction>
<instruction>4) Build candidate bundles by combining complementary solutions so every bundle fully covers all 4 domains. Keep bundles minimal.</instruction>
<instruction>5) For each bundle, write an explicit “capability mapping”:
- name the approach for each domain (e.g., TF-IDF keywords, supervised classifier, HNSW ANN, k-means, HDBSCAN, MinHash, SimHash),
- list explicit tools/libraries drawn from the included solutions,
- if any tools or runtimes are unstated, identify them by leveraging external knowledge via search queries.</instruction>
<instruction>6) Score each bundle:
- compute ease_of_deployment strictly via the <ease_of_deployment><rubric>,
- compute coverage strictly via the <coverage><rubric>,
- include a bullet “scoring_rationale” with (a) the applied rubric penalties/bonuses and (b) supporting evidence quotes.</instruction>
<instruction>7) Rank bundles by: ease_of_deployment DESC, then coverage DESC, then fewer services/toolchains.</instruction>
<instruction>8) For each ranked bundle, provide:
- included_solution_ids,
- excluded_but_relevant_solution_ids (high-ranked originals you did not use) with reasons tied to constraints or redundancy,
- a pragmatic implementation plan (high-level steps) and a plain-text pipeline diagram.
Do not claim specific packages exist in Debian repos unless explicitly stated in [[constraints_text]] or [[solutions_100]] or found within external knowledge search results.</instruction>
</instructions>
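The ranking order in instruction 7 can be sketched as a single sort with a composite key. The dictionary keys are hypothetical, and comparing services before toolchains in the final tie-break is an assumption, since the instruction only says "fewer services/toolchains".

```python
def rank_bundles(bundles: list) -> list:
    """Sort bundles: ease_of_deployment DESC, coverage DESC,
    then fewer services, then fewer toolchains (tie-break order assumed)."""
    return sorted(
        bundles,
        key=lambda b: (
            -b["ease_of_deployment"],
            -b["coverage"],
            b["services"],
            b["toolchains"],
        ),
    )


if __name__ == "__main__":
    # Illustrative data only; not derived from [[solutions_100]].
    sample = [
        {"id": "A", "ease_of_deployment": 80, "coverage": 90, "services": 1, "toolchains": 2},
        {"id": "B", "ease_of_deployment": 95, "coverage": 75, "services": 0, "toolchains": 1},
    ]
    for position, bundle in enumerate(rank_bundles(sample), start=1):
        print(position, bundle["id"])
```

Because Python's sort is stable, bundles that tie on every key keep their input order, so feeding them in original_rank order gives a deterministic result.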
URL: https://ib.bsb.br/sol2con