solutions VS constraints

Slug: sol2con

7080 characters 751 words
<purpose>
You, as the AI ASSISTANT, must take (a) a list of 100 ranked solutions from [[solutions_100]] and (b) the target deployment constraints from [[constraints_text]], then produce a NEW ranking of consolidated “Solution Bundles” (a.k.a. stacks).
Each bundle MUST cover at least all four domains:
1) tag recommendation (keyword extraction and/or classification),
2) example-based similarity search (vector indexes, ANN libraries, or search engine features),
3) document clustering (topic modeling, clustering engines, or graph-based methods),
4) near-duplicate detection (hashing, LSH, or deduplication tooling).
Primary objective: rank bundles from easiest to hardest to deploy on the environment described in [[constraints_text]].
Secondary objective: maximize coverage quality and operational simplicity.
Success criteria:
- Every ranked bundle covers at least all four domains.
- Any uncertainty is explicitly reduced by consulting external knowledge via search queries.
</purpose>
<context>
<grounding_rules>
<rule>Feel free to search external knowledge for any dependency, library, tool, or runtime requirement that is not explicitly stated in [[solutions_100]] or [[constraints_text]].</rule>
<rule>When ARM64/aarch64 compatibility is not explicitly stated, use search queries to verify it.</rule>
<rule>When you make a scoring decision, include at least one supporting quote from [[constraints_text]] or cite the exact solution text that motivated it.</rule>
</grounding_rules>
<ranking_principles>
<principle>Prefer fewer moving parts: fewer services, fewer build steps, fewer always-on daemons.</principle>
<principle>Prefer components that match the available runtime/toolchain described in [[constraints_text]].</principle>
<principle>Disregard docker/kvm/virtualization approaches because those capabilities are supported in [[constraints_text]].</principle>
</ranking_principles>
<scoring_model>
<ease_of_deployment>
<definition> A 0–100
score estimating how straightforward it is to install, build (if needed), and operate the bundle on the target machine. The score MUST be computed via the rubric below; do not invent package availability. </definition>
<rubric>
<![CDATA[
Start at 100 and subtract:
-40 if the bundle requires GPU/CUDA or any GPU-only dependency (unless [[constraints_text]] explicitly supports it, e.g., approaches targeting the 'mali G610 (rk3588)' GPU).
-25 if the bundle requires a runtime that cannot be installed or built within [[constraints_text]] OR requires kernel (5.10) or libc6 (2.31) upgrades.
-15 for each always-on external service required (e.g., JVM service, DB, search engine) beyond the first.
-20 if compilation from source on ARM64 is likely AND no prebuilt artifact is stated in [[solutions_100]]; use search queries to check for prebuilt artifacts.
-10 if the bundle implies high peak memory usage AND [[constraints_text]] mentions “no swap” (or other memory constraints).
-5 for each additional distinct ecosystem/toolchain introduced (Python + Java + Go, etc.).
Add back (cap at 100):
+10 if the bundle can be a single-process deployment.
+10 if the bundle supports incremental updates (append-only indexing that avoids full index rebuilds), as explicitly stated.
]]>
</rubric>
</ease_of_deployment>
<coverage>
<definition> A 0–100 score estimating how explicitly and robustly the bundle covers each required domain. </definition>
<rubric>
<![CDATA[
25 points per domain if the bundle names an explicit approach AND at least one explicit tool/library.
+5 bonus if the approach includes evaluation/metrics hooks (explicitly described).
-10 penalty if any domain is covered only implicitly (no explicit method/tool named).
]]>
</rubric>
</coverage>
</scoring_model>
</context>
<input_data>
<solutions_100>[[ `````````solutions_100 ~~~~~~ """ """ ~~~~~~ ````````` ]]</solutions_100>
<constraints_text>[[ `````````constraints_text ~~~~~~ """ url: `https://raw.githubusercontent.com/ib-bsb-br/ib-bsb-br.github.io/refs/heads/main/_posts/2024-07-20-vpc3588.md` """ ~~~~~~ ````````` ]]</constraints_text>
</input_data>
<instructions>
<instruction>1) Extract an environment summary from [[constraints_text]] as factual key-value fields (arch, OS, language runtimes, RAM, swap, and any runtime that is explicitly missing).</instruction>
<instruction>2) Parse [[solutions_100]] into a normalized list of 100 items with fields: id, original_rank, name (if present), description, declared_tools, declared_runtimes/services (search external knowledge if that information is not present).</instruction>
<instruction>3) For each solution, label which domains it covers: {tag_recommendation, similarity_search, document_clustering, near_duplicate_detection}. Use “partial” when the solution only provides a building block.</instruction>
<instruction>4) Build candidate bundles by combining complementary solutions so every bundle fully covers all 4 domains.
Keep bundles minimal.</instruction>
<instruction>5) For each bundle, write an explicit “capability mapping”:
- name the approach for each domain (e.g., TF-IDF keywords, supervised classifier, HNSW ANN, k-means, HDBSCAN, MinHash, SimHash),
- list explicit tools/libraries drawn from the included solutions,
- if any tools or runtimes are unstated, identify them via search queries.</instruction>
<instruction>6) Score each bundle:
- compute ease_of_deployment strictly via the <ease_of_deployment><rubric>,
- compute coverage strictly via the <coverage><rubric>,
- include a bullet “scoring_rationale” with (a) the applied rubric penalties/bonuses and (b) supporting evidence quotes.</instruction>
<instruction>7) Rank bundles by: ease_of_deployment DESC, then coverage DESC, then fewer services/toolchains.</instruction>
<instruction>8) For each ranked bundle, provide:
- included_solution_ids,
- excluded_but_relevant_solution_ids (high-ranked originals you did not use) with reasons tied to constraints or redundancy,
- a pragmatic implementation plan (high-level steps) and a plain-text pipeline diagram.
Do not claim that specific packages exist in Debian repos unless this is explicitly stated in [[constraints_text]] or [[solutions_100]] or found in external knowledge search results.</instruction>
</instructions>
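
The two rubrics and the ranking order in instruction 7 are plain arithmetic, so they can be checked mechanically. The following is a minimal Python sketch of that arithmetic, not part of the prompt itself. The `Bundle` class and all of its field names are hypothetical, introduced only for illustration. Three interpretations are assumptions: scores are clamped to the 0–100 range, the -10 coverage penalty is applied once per bundle rather than once per domain, and the final tie-break counts services plus toolchains.

```python
from dataclasses import dataclass

DOMAINS = ("tag_recommendation", "similarity_search",
           "document_clustering", "near_duplicate_detection")


@dataclass
class Bundle:
    # Hypothetical record; every field name here is an illustrative assumption.
    name: str
    needs_unsupported_gpu: bool = False      # GPU/CUDA not supported by constraints
    needs_unavailable_runtime: bool = False  # or a kernel/libc6 upgrade
    always_on_services: int = 0
    needs_source_build: bool = False         # no prebuilt ARM64 artifact found
    high_peak_memory: bool = False
    toolchains: int = 1                      # distinct ecosystems (Python, Java, ...)
    single_process: bool = False
    incremental_updates: bool = False
    explicit_domains: tuple = ()             # approach AND tool explicitly named
    implicit_domains: tuple = ()             # covered only implicitly
    has_eval_hooks: bool = False


def ease_of_deployment(b: Bundle, memory_constrained: bool = True) -> int:
    """Apply the ease_of_deployment rubric: start at 100, subtract, add back."""
    score = 100
    if b.needs_unsupported_gpu:
        score -= 40
    if b.needs_unavailable_runtime:
        score -= 25
    score -= 15 * max(0, b.always_on_services - 1)  # the first service is free
    if b.needs_source_build:
        score -= 20
    if b.high_peak_memory and memory_constrained:
        score -= 10
    score -= 5 * max(0, b.toolchains - 1)           # each additional toolchain
    if b.single_process:
        score += 10
    if b.incremental_updates:
        score += 10
    return max(0, min(100, score))                  # cap at 100; floor at 0 is assumed


def coverage(b: Bundle) -> int:
    """Apply the coverage rubric: 25 per explicit domain, +5 bonus, -10 penalty."""
    score = 25 * len(set(b.explicit_domains) & set(DOMAINS))
    if b.has_eval_hooks:
        score += 5
    if b.implicit_domains:                          # penalty applied once (assumption)
        score -= 10
    return max(0, min(100, score))


def rank(bundles):
    """Order by ease DESC, then coverage DESC, then fewer services/toolchains."""
    return sorted(
        bundles,
        key=lambda b: (-ease_of_deployment(b), -coverage(b),
                       b.always_on_services + b.toolchains),
    )


# Usage with two made-up bundles.
light = Bundle("single-process stack", single_process=True,
               explicit_domains=DOMAINS)
heavy = Bundle("search engine plus database", always_on_services=2,
               toolchains=2, needs_source_build=True,
               explicit_domains=DOMAINS[:3], implicit_domains=DOMAINS[3:])
for b in rank([heavy, light]):
    print(b.name, ease_of_deployment(b), coverage(b))
```

Keeping the penalties as explicit boolean and integer fields makes it straightforward to list each applied penalty or bonus in the “scoring_rationale” bullet required by instruction 6.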
URL: https://ib.bsb.br/sol2con