solutions VS constraints

<purpose>
  You, as the AI ASSISTANT, must take (a) a list of 100 ranked solutions from [[solutions_100]] and (b) the target deployment constraints from [[constraints_text]], then produce a NEW ranking of consolidated “Solution Bundles” (a.k.a. stacks).

  Each bundle MUST cover at least the following four domains:
    1) tag recommendation (keyword extraction and/or classification),
    2) example-based similarity search (vector indexes, ANN libraries, or search engine features),
    3) document clustering (topic modeling, clustering engines, or graph-based methods),
    4) near-duplicate detection (hashing, LSH, or deduplication tooling).

  Primary objective: rank bundles from easiest to hardest to deploy on the environment described in [[constraints_text]].
  Secondary objective: maximize coverage quality and operational simplicity.

  Success criteria:
    - Every ranked bundle covers at least all four domains.
    - Any uncertainty is explicitly resolved by consulting external knowledge via search queries.
</purpose>

<context>
  <grounding_rules>
    <rule>Feel free to search external sources for any dependency, library, tool, or runtime requirement that is not explicitly stated in [[solutions_100]] or [[constraints_text]].</rule>
    <rule>When ARM64/aarch64 compatibility is not explicitly stated, verify it via external search queries.</rule>
    <rule>When you make a scoring decision, include at least one supporting quote from [[constraints_text]] or cite the exact solution text that motivated it.</rule>
  </grounding_rules>

  <ranking_principles>
    <principle>Prefer fewer moving parts: fewer services, fewer build steps, fewer always-on daemons.</principle>
    <principle>Prefer components that match the available runtime/toolchain described in [[constraints_text]].</principle>
    <principle>Disregard docker/kvm/virtualization approaches because those capabilities are supported in [[constraints_text]].</principle>
  </ranking_principles>

  <scoring_model>
    <ease_of_deployment>
      <definition>
        A 0–100 score estimating how straightforward it is to install, build (if needed), and operate the bundle on the target machine.
        Score MUST be computed via the rubric below; do not invent package availability.
      </definition>
      <rubric>
        <![CDATA[
        Start at 100 and subtract:
          -40 if it requires GPU/CUDA or any GPU-only dependency (unless [[constraints_text]] explicitly supports it, e.g., GPU-based approaches for the 'mali G610 (rk3588)').
          -25 if it requires a runtime that cannot be installed or built under the constraints in [[constraints_text]] OR requires upgrading the kernel (5.10) or libc6 (2.31).
          -15 for each always-on external service required (e.g., JVM service, DB, search engine) beyond the first.
          -20 if compilation from source on ARM64 is likely AND no prebuilt artifact is stated in [[solutions_100]]; verify prebuilt availability via external search queries.
          -10 if the bundle implies high peak memory usage AND [[constraints_text]] mentions “no swap” (or other memory constraints).
          -5  for each additional distinct ecosystem/toolchain introduced (Python + Java + Go, etc.).

        Add back (cap at 100):
          +10 if the bundle can be a single-process deployment.
          +10 if it supports incremental updates (append-only updates that avoid a full index rebuild), as explicitly stated.
        ]]>
      </rubric>
    </ease_of_deployment>
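A minimal sketch of how the ease_of_deployment rubric above could be applied mechanically. The `BundleFacts` class and all of its field names are hypothetical and not part of the prompt. Two further assumptions: "beyond the first" is read as exempting one service, "additional" toolchains as exempting one toolchain, and the final score is clamped to the 0–100 range given in the definition.

```python
from dataclasses import dataclass


@dataclass
class BundleFacts:
    """Hypothetical per-bundle facts; field names are illustrative only."""
    needs_unsupported_gpu: bool = False
    needs_uninstallable_runtime_or_upgrade: bool = False
    always_on_services: int = 0
    needs_source_build_without_prebuilt: bool = False
    high_memory_under_memory_constraint: bool = False
    distinct_toolchains: int = 1
    single_process: bool = False
    incremental_updates_stated: bool = False


def ease_of_deployment(f: BundleFacts) -> int:
    """Start at 100, subtract penalties, add bonuses, clamp to 0..100."""
    score = 100
    if f.needs_unsupported_gpu:
        score -= 40
    if f.needs_uninstallable_runtime_or_upgrade:
        score -= 25
    # -15 per always-on service beyond the first (assumed reading).
    score -= 15 * max(0, f.always_on_services - 1)
    if f.needs_source_build_without_prebuilt:
        score -= 20
    if f.high_memory_under_memory_constraint:
        score -= 10
    # -5 per toolchain beyond the first (assumed reading of "additional").
    score -= 5 * max(0, f.distinct_toolchains - 1)
    if f.single_process:
        score += 10
    if f.incremental_updates_stated:
        score += 10
    # The rubric caps at 100; the floor of 0 follows the 0-100 definition.
    return max(0, min(100, score))


example = BundleFacts(
    always_on_services=3,
    distinct_toolchains=2,
    needs_source_build_without_prebuilt=True,
)
print(ease_of_deployment(example))  # 100 - 30 - 20 - 5 = 45
```

Keeping each rubric line as its own statement makes it straightforward to list the applied penalties and bonuses in the scoring rationale.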

    <coverage>
      <definition>
        A 0–100 score estimating how explicitly and robustly the bundle covers each required domain.
      </definition>
      <rubric>
        <![CDATA[
        25 points per domain if the bundle names an explicit approach AND at least one explicit tool/library.
        +5 bonus if the approach includes evaluation/metrics hooks (explicitly described).
        -10 penalty if any domain is covered only implicitly (no explicit method/tool named).
        ]]>
      </rubric>
    </coverage>
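A minimal sketch of the coverage rubric above, under stated assumptions. The shape of `capability_mapping` (domain name mapped to a dict with `approach` and `tools` keys) is hypothetical. The sketch also assumes that a domain lacking an explicit approach or tool counts as "covered only implicitly", that the -10 penalty applies once per bundle, and that the result is clamped to 0–100.

```python
DOMAINS = (
    "tag_recommendation",
    "similarity_search",
    "document_clustering",
    "near_duplicate_detection",
)


def coverage_score(capability_mapping: dict, has_eval_hooks: bool = False) -> int:
    """Score coverage per the rubric; input shape is an assumption."""
    # A domain is explicit only if it names an approach AND at least one tool.
    explicit = [
        d for d in DOMAINS
        if capability_mapping.get(d, {}).get("approach")
        and capability_mapping.get(d, {}).get("tools")
    ]
    score = 25 * len(explicit)
    if has_eval_hooks:
        score += 5
    # Assumed reading: one flat penalty if any domain is not explicit.
    if len(explicit) != len(DOMAINS):
        score -= 10
    return max(0, min(100, score))


# Hypothetical mapping: three explicit domains, one with no tool named.
mapping = {
    "tag_recommendation": {"approach": "TF-IDF keywords", "tools": ["tool_a"]},
    "similarity_search": {"approach": "HNSW ANN", "tools": ["tool_b"]},
    "document_clustering": {"approach": "k-means", "tools": ["tool_c"]},
    "near_duplicate_detection": {"approach": "MinHash", "tools": []},
}
print(coverage_score(mapping, has_eval_hooks=True))  # 75 + 5 - 10 = 70
```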
  </scoring_model>
</context>

<input_data>
  <solutions_100>[[
`````````solutions_100
~~~~~~
"""

"""  
~~~~~~
`````````
  ]]</solutions_100>
  
  <constraints_text>[[
`````````constraints_text
~~~~~~
"""
url: `https://raw.githubusercontent.com/ib-bsb-br/ib-bsb-br.github.io/refs/heads/main/_posts/2024-07-20-vpc3588.md`


"""  
~~~~~~
`````````  
  ]]</constraints_text>
  
</input_data>

<instructions>
  <instruction>1) Extract an environment summary from [[constraints_text]] as factual key-value fields (arch, OS, language runtimes, RAM, swap, and any runtime that is explicitly missing).</instruction>
  <instruction>2) Parse [[solutions_100]] into a normalized list of 100 items with fields: id, original_rank, name (if present), description, declared_tools, declared_runtimes/services (search external knowledge if this information is not present).</instruction>
  <instruction>3) For each solution, label which domains it covers: {tag_recommendation, similarity_search, document_clustering, near_duplicate_detection}. Use “partial” when the solution only provides a building block.</instruction>
  <instruction>4) Build candidate bundles by combining complementary solutions so every bundle fully covers all 4 domains. Keep bundles minimal.</instruction>
  <instruction>5) For each bundle, write an explicit “capability mapping”:
    - name the approach for each domain (e.g., TF-IDF keywords, supervised classifier, HNSW ANN, k-means, HDBSCAN, MinHash, SimHash),
    - list explicit tools/libraries drawn from the included solutions,
    - if any tools or runtimes are unstated, look them up via external search queries.</instruction>
  <instruction>6) Score each bundle:
    - compute ease_of_deployment strictly via the rubric in scoring_model/ease_of_deployment,
    - compute coverage strictly via the rubric in scoring_model/coverage,
    - include a bullet “scoring_rationale” with (a) the applied rubric penalties/bonuses and (b) supporting evidence quotes.</instruction>
  <instruction>7) Rank bundles by: ease_of_deployment DESC, then coverage DESC, then fewer services/toolchains.</instruction>
  <instruction>8) For each ranked bundle, provide:
    - included_solution_ids,
    - excluded_but_relevant_solution_ids (high-ranked originals you did not use) with reasons tied to constraints or redundancy,
    - a pragmatic implementation plan (high-level steps) and a plain-text pipeline diagram.
    Do not claim that specific packages exist in Debian repos unless this is explicitly stated in [[constraints_text]] or [[solutions_100]], or confirmed by external search results.</instruction>
</instructions>
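The completeness check from instruction 4 and the ordering from instruction 7 can be sketched as follows. The `Bundle` type, its field names, and the sample bundles are hypothetical. Treating "fewer services/toolchains" as a tie-break on service count first and toolchain count second is an assumption, since the instruction does not say how the two are combined.

```python
from typing import NamedTuple

REQUIRED_DOMAINS = frozenset({
    "tag_recommendation",
    "similarity_search",
    "document_clustering",
    "near_duplicate_detection",
})


class Bundle(NamedTuple):
    """Hypothetical record for a scored bundle."""
    name: str
    domains: frozenset
    ease_of_deployment: int
    coverage: int
    services: int
    toolchains: int


def rank_bundles(bundles):
    """Drop bundles missing a required domain, then sort per instruction 7."""
    complete = [b for b in bundles if REQUIRED_DOMAINS.issubset(b.domains)]
    return sorted(
        complete,
        # Negated scores give descending order; counts sort ascending.
        key=lambda b: (-b.ease_of_deployment, -b.coverage, b.services, b.toolchains),
    )


candidates = [
    Bundle("A", REQUIRED_DOMAINS, 90, 100, 1, 1),
    Bundle("B", REQUIRED_DOMAINS, 90, 100, 2, 1),
    Bundle("C", REQUIRED_DOMAINS, 95, 80, 1, 2),
    Bundle("D", frozenset({"similarity_search"}), 100, 25, 0, 1),
]
print([b.name for b in rank_bundles(candidates)])  # ['C', 'A', 'B']
```

Bundle D is excluded despite its high ease score because it does not cover all four domains.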
URL: https://ib.bsb.br/sol2con