- Creating GEDCOM data examples and implementing pipeline
- Building semantic graph and grounding with Agda libraries
- Defining GEDCOM structure and pipeline
- Creating GEDCOM examples and structure
- Encoding GEDCOM example structures in Agda
- Defining and structuring a minimal tree in Agda
- Creating `Iso` between semantic graph and canonical serial
- Exploring GEDCOM tree and isomorphism
- Clarifying file_search tool behavior
- Preparing the search query for file_search
- Searching for relevant modules in agda-libraries
- Clarifying `pathToIso` from Cubical Agda
- Explaining loops in Cubical Foundations
- Confirming pathToIso definition and library inclusion
- Confirming pathToIso and preparing recipe adaptation
- Inferring recipes and structuring the solution
- Creating a structured pipeline
- Clarifying user request and tracing path
- Clarifying approach for reification/serialization pipeline
- Outlining GEDCOM structure and path-to-Iso usage
- Modeling step-parent relationships in GEDCOM
- Representing step-child relationships in GEDCOM
- Refining GEDCOM extensions and Agda examples
- Designing Cubical code for GEDCOM examples
- Designing Agda code with GEDCOM grounding
- Structuring GEDCOM records in Agda
- Designing a pipeline from free tree to semantic graph
- Creating GEDCOM examples grounded in specification
- Creating GEDCOM examples with proper pointers and references
- Creating a complete genealogy with lateral relationships
- Modeling step-parent relationships in GEDCOM
- Modeling adoption and family relationships in GEDCOM
- Structuring family, lateral, and collateral relationships
- Integrating extension records and custom tags
- Defining dataset and reflection process in Agda
- Generating detailed GEDCOM dataset examples and schema
- Citing alternative approaches and related repositories
- Citing repositories and exploring alternative approaches
- Structuring output with JSON and citations
- Formatting GEDCOM examples and citations properly
- Clarifying the approach for pathToIso and canonicalization
- Clarifying the loop for canonicalization and equality
- Explaining the univalence loop and pipeline
- Demonstrating finite index-based encoding
- Crafting domain-faithful response with examples
- Clarifying GEDCOM pipeline stages
- Defining “mini-GEDCOM” subset for simplification
- Defining canonical serial and semantic graph with simple proofs
- Designing a simple pipeline for GEDCOM data
- Clarifying the Iso loop and pipeline design
- Explaining the pipeline and iso loop
- Proposing a solution structure
- Crafting the six recipes
- Crafting GEDCOM module with minimal definitions
- Defining canonical and semantic graphs
- Defining uniqueness and computed fields in serial
- Designing serial and semantics representation
- Reflect and reify as definitional inverses
- Defining `Sem` and `Serial` for genealogical data
- Redesigning child’s relation storage in the semantic graph
- Refining the semantics and enforcing bidirectional pointers
- Clarifying type constraints and decidability issues
- Choosing between DecEq and simplifying design
- Choosing Id types with decidable equality for Sem and Serial
- Computing membership and defining reify function
- Reflection on Sem to Serial conversion and canonicalization
- Considering ISO and providing examples with proofs
- Adapting for a compiling domain-faithful loop
- Designing `serialWF` for semantics
- Defining reflect and reify with serialWF
- Formalizing serial and semantic structure
- Defining semantics and simplifying serial representation
- Defining and showing isomorphism between Sem and SerialWF
- Defining canonical serial and proving isomorphism
- Defining isomorphism between serial and canonical structure
- Defining a canonical serial structure and isomorphism
- Defining semantic graph and canonical serial relationship
- Defining pipeline structure for genealogical data
- Breaking down the genealogical data pipeline
- Creating loops for demonstrative examples
- Crafting response structure
- Providing code structure and examples
- Ensuring citations and Agda code structure
- Defining GEDCOM types and handling tags
- Defining key structures
- Defining semantic graph and serial records
- Defining well-formedness (WF)
- Defining a canonical serial representation
- Defining the canonical serial with iso
- Simplifying the graph representation and encoding
- Outlining use cases and next steps
- Planning the response and generating example dataset
- Creating GEDCOM structure and family definitions
- Creating dummy datasets and encoding in Agda
- Defining `Line` type and dataset representation
- Creating an Agda example for iso loops
- Defining WF and pointers
- Defining derived functions in SemGraph
- Clarifying canonical serial vs. semantic graph
- Defining canonical serial for iso with semantics
- Refining the canonical serial and pipeline
- Considering iso between serial and semantics
- Planning the user’s iso task with variations
- Defining a skeleton for iso implementation
- Aligning Family Structures and Representation
- Crafting Deliverables for GEDCOM Subset
- Ensuring Alternatives
- 0) What `pathToIso` actually gives you (and why it’s perfect for “transport loops”)
- 1) The “univalence/transport loop” you want, in one picture
- 2) Domain scaffold: genealogies are graphs, so model the “structure vs serialization” boundary explicitly
- 3) Recipe A: Reification/serialization as an `Iso`, then turn it into a type equality, then back to `Iso` via `pathToIso`
- 4) Recipe B: The “round-trip correctness ⇒ equality of genealogies” workflow (your Structure→Property→Equality loop)
- 5) Recipe C: Alpha-equivalence / person-renaming as a path, then `pathToIso` gives the renaming program for free
- 6) Recipe D: “Decomposition of a structure” (record ↔ Σ/× normal form), then reuse generic theorems on Σ-types
- 7) Recipe E: Transporting properties (invariants) across your structure/serialization equality using `substIso`
- 8) Recipe F: Proving the loop really closes — `pathToIso (isoToPath f)` is your original `Iso` (for sets)
- 9) How to apply these patterns to “Free Tree genealogy reification” specifically
- Alternative approaches (often useful alongside `pathToIso`)
- Related GitHub repositories / codebases to mine for patterns
- Example 1 — Step-family + laterals + void-child placeholder + documented extensions + marriage age
- Example 2 — Adoption modeled with `ADOP` + `FAMC` + “who adopted” enum
- Example 3 — Two different extension-tag spellings, same URI semantics
- Example 4 — Redaction/removal preserved structurally via `@VOID@`
- Core Cubical tools we’ll rely on
- 0) Minimal GEDCOM-faithful core types
- 1) Iso combinators you’ll use constantly
- 2) Lifting `Iso` through GEDCOM record cores
- 3) The dataset Iso (the heart of reindexing + schema normalization)
- Example IDs and extension schemas
- The example dataset (core reification of Example 1)
- Canonicalization step 1: normalize extension tags to URIs
- Transport computation: get rid of mysterious transports
- Recipe A — Optional pointers vs `@VOID@` pointers
- Recipe B — Documented extension tags: rename surface tags by URI semantics
- Recipe C — XREF reindexing: “string IDs” → finite indices (canonical)
- Recipe D — The Structure→Property→Equality→Structure loop for GEDCOM “well-formedness”
- Recipe E — Alternative GEDCOM modeling choices become isomorphic canonical models
- Recipe F — Adoption normalization (PEDI vs ADOP-event)
- Use `Equiv`/`ua` (univalence) instead of `Iso`, and reserve `Iso` for computational extraction
- Use SIP (Structure Identity Principle) to turn “structure-preserving renaming” directly into paths
{
  "primary_intent": "Provide extensive, practical Cubical Agda use-cases for `pathToIso` by building domain-faithful Paths from GEDCOM-grounded genealogy 'states of affairs', then extracting `Iso` via `pathToIso`, completing Structure↔Property↔Equality↔Structure for a free tree → canonical serial → semantic graph pipeline.",
  "deliverables": [
    "Several detailed dummy GEDCOM v7 fragments (INDI/FAM, pointers, PEDI/ADOP, ASSO/ROLE, void pointers, SCHMA/TAG documented extensions, relocated standard structures) grounded in the FamilySearch GEDCOM specification",
    "A Cubical Agda model (Ptr/voidPtr, INDI/FAM cores, schema/tag meaning via URIs, finite reindexing) aligned with the pipeline free-tree → canonical-serial → semantic-graph",
    "Concrete 'recipes' (A–F) showing how domain 'state of affairs' facts induce Paths, how to recover `Iso` with `pathToIso`, and how to use `isoToPath` + `transport` (incl. `transportIsoToPath`) to move properties across representations",
    "Agda snippets demonstrating `Iso → isoToPath → transport → pathToIso` loops on the genealogy reification/serialization domain",
    "At least one alternative approach and a list of related GitHub repositories"
  ],
  "directives": [
    { "text": "Begin the response with a JSON object inside a code block; then provide the full answer, including at least one alternative approach and related GitHub repositories." },
    { "text": "Ground GEDCOM claims in the provided specification markdown and use file citations." },
    { "text": "Ground Cubical Agda choices in the provided agda-libraries.txt and use file citations." },
    { "text": "Use web browsing for up-to-date/niche items that benefit from freshness (e.g., GitHub repos) and cite sources." },
    { "text": "Provide thorough, practical, code-oriented explanation; do not promise future work." }
  ],
  "output_requirements": [
    "Extensive, practical, domain-faithful examples",
    "Concrete Agda 'recipes' and snippets using `Iso`, `isoToPath`, `pathToIso`, `transport`",
    "Clear citations for specification/library facts",
    "Include alternative approach + related GitHub repos"
  ],
  "tooling": {
    "available_tools": ["file_search", "web.run", "python", "container"],
    "tool_constraints": [
      "Use `file_search` for the uploaded GEDCOM spec markdown and agda-libraries.txt; cite with markers.",
      "Use `web.run` for current GitHub repository references; cite with markers.",
      "Do not put file citations inside LaTeX blocks; avoid citations inside code fences to keep code compilable."
    ]
  },
  "context": {
    "facts": [
      "User provided: The FamilySearch GEDCOM v7 specification as a markdown file and as a URL.",
      "User provided: agda-libraries.txt listing relevant Cubical Agda modules/functions around transport/Glue/path/Iso.",
      "Requested domain: complete genealogy with laterals/collaterals/marriages/step-parents etc; pipeline free tree → canonical serial → semantic graph."
    ],
    "assumptions": [
      "We model a faithful *subset* of GEDCOM (INDI, FAM, pointers, PEDI/ADOP, ASSO/ROLE, SCHMA/TAG, voidPtr) sufficient to demonstrate the Iso/Path/transport loops.",
      "We demonstrate canonicalization via (1) schema normalization (extension tags → URIs) and (2) reindexing (XREF-like IDs → finite indices).",
      "Snippets are written to match Cubical Agda conventions; Agda cannot be typechecked in this environment, so correctness is best-effort and aligned with standard Cubical library interfaces."
    ],
    "unknowns": [
      "Exact earlier 'Recipe A/B/C/D/E/F' wording from the previous assistant message; recipes are reconstructed in a compatible, practical form.",
      "User’s exact internal ‘free tree’ constructors; the answer provides a reusable pattern that can be adapted."
    ]
  },
  "safety": {
    "risk_flags": [],
    "refusal_requirements": []
  },
  "inputs": [
    "User request for extensive GEDCOM-grounded dummy examples + Cubical Agda recipes for Iso+isoToPath+pathToIso loops in a genealogy reification domain."
  ]
}
I’m focusing on generating examples based on the GEDCOM specification (latest version: 7.0.18). I’ll create dummy GEDCOM dataset fragments and demonstrate how to construct a path from a domain problem to a semantic graph. The goal is to write Agda code with minimal constructs while avoiding overly complex features.
#Creating GEDCOM data examples and implementing pipeline
The user requests dummy GEDCOM data based on FamilySearch’s GEDCOM 7 specification. I’ll create examples using GEDCOM tags like INDI, FAM, and others, including step-parents with PEDI links. These examples will serve as input for the “free tree → canonical serial → semantic graph” pipeline, ensuring it follows the correct structure and transformations.
#Building semantic graph and grounding with Agda libraries
I’ll create a genealogy structure using identifiers like `Xref` or `PersonId`,
and demonstrate the relationship between the semantic graph and canonical serial form using
isomorphisms. I’ll cite the relevant Agda libraries, focusing on `agda-libraries.txt` for module imports. Once I locate the necessary code snippets,
I’ll provide appropriate citations.
#Defining GEDCOM structure and pipeline
I’ll cite the truncated snippet and search for relevant parts of the Agda libraries, especially
around `pathToIso` and `isoToPath`. I also need to highlight GEDCOM’s
hierarchical container format, using lines like Level, Xref, Tag, and values. This will help
create a parser and model the serialized data as a node tree. We’ll include citations accordingly.
#Creating GEDCOM examples and structure
I’ll define dummy GEDCOM examples including basic families, complex structures (like marriages, adoption, step-parents), sibling and collateral relations through associations, and extension tags. These examples will adhere to the GEDCOM spec, using citations for grammar, schema mapping, and pseudo-structures. Cross-references, pointers, and well-formedness predicates will help ensure data validity.
#Encoding GEDCOM example structures in Agda
I will begin by defining key GEDCOM elements in Agda, such as tags, pointers, and line values.
Tags will be strings or enumerated types, pointers like `Xref` will reference via strings
or records, and I’ll abstract line values as either pointers or plain strings. Then, I’ll model
hierarchical structures using nodes and a tree representation of dataset lines.
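A minimal sketch of the types this paragraph describes, assuming a toy encoding where tags are plain strings and nesting depth stands in for GEDCOM level numbers. The names `Ptr`, `LineVal`, `Node`, and `alice` are hypothetical, a record's own cross-reference identifier is left out for brevity, and the snippet is best-effort against the agda/cubical library rather than typechecked here.

```agda
{-# OPTIONS --cubical --safe #-}
open import Cubical.Foundations.Prelude
open import Cubical.Data.List using (List; []; _∷_)
open import Agda.Builtin.String using (String)

-- A pointer is either the void pointer (@VOID@) or a named cross-reference.
data Ptr : Type where
  void : Ptr
  xref : String → Ptr

-- A line value is absent, a pointer, or plain text.
data LineVal : Type where
  none : LineVal
  ptr  : Ptr → LineVal
  text : String → LineVal

-- Free tree: a tag, a value, and nested substructures.
data Node : Type where
  node : (tag : String) (val : LineVal) (children : List Node) → Node

-- Hypothetical fragment: an INDI record with a NAME and a FAMC pointer.
alice : Node
alice =
  node "INDI" none
    ( node "NAME" (text "Alice /Doe/") []
    ∷ node "FAMC" (ptr (xref "F1")) []
    ∷ [] )
```

Keeping `Ptr` separate from `String` makes the void pointer a constructor rather than a magic string, which is what later well-formedness predicates would pattern-match on.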
#Defining and structuring a minimal tree in Agda
For the free tree, I’ll use a minimal tree structure where each record is a node with potential
children. This simplifies parsing GEDCOM lines into a hierarchical tree. While a full parser might
be complex, I can focus on minimal parsing with toy functions and demonstrate core functions like `pathToIso` and `isoToPath`. For the user’s request, I’ll aim for code that
compiles and aligns with the desired pipeline.
#Creating`Iso`between semantic graph and canonical serial
The goal here is to create an isomorphism (`Iso`) between the semantic graph
(`Genealogy`) and canonical serial type (`Σ Serial WF`). I’ll simplify the
process with just `INDI` and `FAM` records, implementing `reify` and `reflect` as identity functions for a basic, easy-to-verify structure. I’ll aim for
a pipeline that ensures minimal complexity and shows the connection to GEDCOM lines.
#Exploring GEDCOM tree and isomorphism
The idea here is that parsing may lose ordering, but we can still show an isomorphism between the canonical serial and a “canonical GEDCOM tree” with fixed ordering. Once that’s done, we can convert this tree back to a list of lines. However, parsing might be complex, so I’ll focus on simple dummy examples and generalize them later. For now, I’ll make sure the code compiles using standard modules, add necessary imports, and look up relevant sections from the GEDCOM spec.
#Clarifying file_search tool behavior
It seems there’s some confusion about how the `file_search` tool functions. The system
suggests that it can search user-uploaded files without needing a `source_filter`, but
the developer message includes additional requirements like `source_filter`, which
doesn’t align with earlier tool descriptions. I’ll proceed with the system’s original schema to
avoid errors.
#Preparing the search query for file_search
I’ll stick with the system tool schema and avoid using the `source_filter` since it’s
not part of the official schema. The developer message looks like it might be for something
internal. I’ll craft my main query to match the user’s request, and also include others related
to GEDCOM specifics. Time to make the call!
#Searching for relevant modules in agda-libraries
Now that I have relevant snippets with citations, I need to search the “agda-libraries” file for
modules related to `pathToIso`. Since I can’t specify a file directly, I’ll craft the
right queries and hope the results pull from the correct file. Let’s give it a try!
#Clarifying`pathToIso`from Cubical Agda
The Agda-libraries file has modules related to `pathToIso`, including in `Cubical.Foundations.Isomorphism`, which converts `A ≡ B` to `Iso A B`.
The user originally mentioned `pathToIso` from Cubical Agda, though there may be another
version in `Cubical.Categories.Category.Base`. I’ll keep these distinctions in mind as I work.
#Explaining loops in Cubical Foundations
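We need to show loops between structure, property, equality, and structure. In Cubical Foundations, `isoToPath` converts `Iso A B` to `A ≡ B`, and `pathToIso` goes the other way, forming a canonical `Iso` from the path using `transport`. Univalence
relates `A ≡ B` to `A ≃ B`. Every equivalence yields an `Iso`, but for types that are not sets an `Iso` carries extra proof data,
so the round trip is only guaranteed to give back the original `Iso` itself when the types are sets.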
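The loop described here can be sketched as follows. This assumes the agda/cubical layout in which `isoToPath` lives in `Cubical.Foundations.Isomorphism`, `pathToIso` in `Cubical.Foundations.Transport`, and `Iso` has a field named `fun`. The names `A≡B`, `e′`, and `fwd-agrees` are illustrative, and the snippet is a best-effort sketch that has not been typechecked here.

```agda
{-# OPTIONS --cubical --safe #-}
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism using (Iso; isoToPath)
open import Cubical.Foundations.Transport using (pathToIso)

module _ {ℓ} {A B : Type ℓ} (e : Iso A B) where

  -- Structure → Equality: the isomorphism becomes a path between types.
  A≡B : A ≡ B
  A≡B = isoToPath e

  -- Equality → Structure: transport along the path, packaged as an Iso.
  e′ : Iso A B
  e′ = pathToIso A≡B

  -- The recovered forward map agrees pointwise with the original one:
  -- transport along isoToPath e computes to a trivial transp around fun e.
  fwd-agrees : (a : A) → Iso.fun e′ a ≡ Iso.fun e a
  fwd-agrees a = transportRefl (Iso.fun e a)
```

The last lemma is the practical payoff: whatever `e′` does after the round trip, its forward direction is still the original serializer up to a path.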
#Confirming pathToIso definition and library inclusion
There is indeed a definition of `pathToIso` which involves `transport p`,
inverses like `transport (sym p)`, and possible round trip using `transportRoundTrip`. I think the library includes `pathToIso`, but to be sure,
I’ll search for `pathToIso :` in the agda-libraries file. Let’s check.
#Confirming pathToIso and preparing recipe adaptation
Great, the agda-libraries file confirms that `pathToIso` and similar functions, like `substIso`, are in `Cubical.Foundations.Transport`. I see its definition, which
maps `A ≡ B` to `Iso A B`. Now, I must focus on crafting practical use cases and
adapting the requested recipes (A/B/C/D/E/F) for the domain, with relevant Agda code snippets. We’ll
also have to reconstruct the recipes referenced earlier.
#Inferring recipes and structuring the solution
Let’s infer the user’s earlier recipes based on the request:
- Recipe A: Iso for semantic to canonical transformations (isoToPath, etc.)
- Recipe B: Σ-types with well-formedness
- Recipe C: Using extensionality for equality
- Recipe D: Possibly involving `sip`
- Recipe E: Focused on reification/serialization
- Recipe F: Using pathToIso
I can’t rely on the previous context, but I can adapt these recipes to the GEDCOM dataset, showing clear transitions through the univalence/transport loop. I’ll structure GEDCOM-like examples to demonstrate the process from free tree to semantic graph representation.
#Creating a structured pipeline
I’ll outline a pipeline based on what the user needs:
- Free Tree: A hierarchical structure representing the raw GEDCOM dataset (forest of nodes). This is a structure.
- Canonical Serial: A normalized, level-coded list of lines ensuring well-formedness. This defines the property.
- Semantic Graph: A typed graph connecting individuals, families, and relationships, representing semantics.
I’d use `Iso` to link semantic and canonical representations, applying `isoToPath` and `pathToIso` to close the loop by transporting properties.
#Clarifying user request and tracing path
The user wants a path constructed from their domain problem, like “decomposing a full genealogy tree”
through reification/serialization. The goal is to extract an Iso directly from the created path in
Cubical Agda, completing the Structure→Property→Equality loop. They’re also asking for practical
examples with GEDCOM dataset and recipes. I’ll adapt the existing patterns into concrete Agda
snippets, considering an ID type (`Id`) for people and using relationships over `Id` for structure. I’ll focus on clear use cases with simplified GEDCOM specs.
#Clarifying approach for reification/serialization pipeline
Incorporating a well-formedness predicate is key to constructing a GEDCOM-like pipeline. We will
use distinct `Id` types for INDI and FAM records, representing relationships as edges
referencing those IDs. From there, the structure will be serialized and normalized into a canonical
GEDCOM AST. Then, we’ll linearize it into GEDCOM lines but may also explore index- or cross-reference-based approaches.
The crucial part is linking the canonical and semantic representations with `pathToIso`,
allowing us to extract an Iso directly from a constructed pathway across types.
#Outlining GEDCOM structure and path-to-Iso usage
This looks like we’re aiming to illustrate how `Iso` links semantic and canonical representations
via `isoToPath` and `pathToIso`. One key problem is showing the structure-to-equality
cycle. We’ll start with GEDCOM-related content: cite its specifications (e.g., grammar, cross-references)
and set up GEDCOM fragments with cross-references. We’ll demonstrate these relationships, including family
structures, adoption, and step-parent child connections. This gives us building blocks to show path-to-Iso
in practice, specifically data restructuring and type equality proofs.
#Modeling step-parent relationships in GEDCOM
I realized that GEDCOM 7 doesn’t directly have a “STEP” value for step-parents in the `PEDI` enumeration. Instead, I could use “OTHER” with the phrase “step”, or possibly use “FOSTER.” The `FAMC` pointer can represent the family relation, and `EVEN` events could further help define it,
though step-parents aren’t specifically represented as events.
#Representing step-child relationships in GEDCOM
I realized that in GEDCOM, step-children can be linked to a family record that includes both a step-parent
and a biological parent. Although the spec advises against having multiple FAMC pointers to the same family,
I can model this by using separate `FAM` pointers for each parent. For step-family, “FOSTER” or
“OTHER” with the phrase “step” may work well to represent the relationship, alongside the adoption `ADOP` event.
#Refining GEDCOM extensions and Agda examples
To properly define extension tags like `_LOC` and others, I’ll refer to the spec about how `HEAD.SCHMA.TAG` works. I’ll need to make sure to cite earlier snippets about extension tags,
line grammar, and the `CONT` line continuation for string escaping. For Agda examples, I’ll ensure
the code is self-contained, taking care with differences between the `Cubical` library and Agda’s built-ins.
#Designing Cubical code for GEDCOM examples
For coding, I’ll use `open import Cubical.Foundations.Prelude` and relevant libraries for `Iso`, `isoToPath`, and `pathToIso`. I’ll define `Xref` for individual
and family IDs using either `String` or `ℕ`, leaning toward `String` for dummy
examples without worrying about strict equality. Cubical’s `String` is built-in, and I’ll treat it as a label for now.
#Designing Agda code with GEDCOM grounding
For mapping individuals and families, I can use lists and invariants like `isProp`, avoiding the need for
equality checks. I’ll define `Id` types, such as `IndiId` and `FamId`, as finite sets
(`Fin n` or `ℕ`), representing GEDCOM xrefs as enumerations. We’ll show both GEDCOM snippets and the
Agda types encoding these lines, using abstract types like `Id` to facilitate indexing. I’ll parameterize the code
to allow for flexible representation.
#Structuring GEDCOM records in Agda
I want to design canonical serial and semantic graph representations of GEDCOM records. For the canonical serial,
I’ll create a `Canon` record with lists for individuals and families, each representing details like names, sex,
families, and events. The semantic graph will split people and families into distinct lists. We’ll create an isomorphism
between the two to ensure they represent the same data and can be converted one to another.
#Designing a pipeline from free tree to semantic graph
The idea is to create a pipeline from free tree (arbitrary GEDCOM lines) through a canonical serial (enforcing GEDCOM constraints) to a semantic graph (internal representation). I’ll build an isomorphism between these stages, using well-formed data types and proofs in Cubical Agda, following the pattern of free trees and their well-formedness proofs.
#Creating GEDCOM examples grounded in specification
I need to generate extensive dummy examples based on the FamilySearch GEDCOM specification. This includes starting with a
HEAD segment and ending with a TRLR, ensuring order. I’ll use version “7.0.18” for GEDC.VERS, map extension tags like `_LOC` and `_ILP`, and use proper line formatting, following the spec’s structure.
#Creating GEDCOM examples with proper pointers and references
I’ll create a sample with line breaks using `CONT @@me...` to match the spec format and show cross-references like `@VOID@` for unknown pointers. We’ll include FAM and INDI records, detailing events like BIRT and ADOP. The FAMC field
will be used to indicate biological or adopted family relationships, citing relevant sections for each.
#Creating a complete genealogy with lateral relationships
I’ll model step-parents and lateral relationships like godparents using PEDI values and associations like ROLE with enums like “GODP” or “OTHER.” I’ll include ALIA records to link aliases for multiple INDI records. The genealogy dataset will include detailed relationships, marriages, children, and step-children.
#Modeling step-parent relationships in GEDCOM
In GEDCOM, you can include Alice (John and Mary’s child) as CHIL in F2, representing her as a step-child in the new family structure. To clarify the step-parent relationship, F2 would list the step-mother (Anne) as WIFE, and Alice’s FAMC pointer in I3 would show “step-parent (Anne)” to indicate the step-relationship.
#Modeling adoption and family relationships in GEDCOM
To capture adoption, I’ll create a family F3 with Anne as WIFE and Charlie as the only child (I5). For adoption by a single person, I’ll use FAMC and ADOP substructure in I5 and mark the relationship as “ADOPTED” with PEDI. For Charlie’s biological family, I could use void pointers (@VOID@) for unknown parents as allowed by the spec.
#Structuring family, lateral, and collateral relationships
We could omit HUSB/WIFE lines for unknown parents or use void pointers, as the spec allows. For laterals (siblings, cousins), I’ll add Bob as a sibling to Alice in F1 and create F5 for cousin relations. For godparents, I’ll use ASSO for I1 and I8 with a “brother-in-law” role. I’ll also incorporate extensions like _LOC for events and legal guardianship (_ILP) where appropriate.
#Integrating extension records and custom tags
Extension tags like `_LOC` can potentially be used at level 0 if allowed by the grammar. Though the formal specification of
extension records isn’t entirely clear, these can define new structures. I’ll treat the `_LOC` record as a custom example, showing
location-related tags, but mention it’s not strictly verified under the GEDCOM standard.
#Defining dataset and reflection process in Agda
We can define the `RawDataset` as a list of GEDCOM lines and ensure proper order with a `WF` property: checking for
HEAD/TRLR, ensuring xrefs point to valid records (except @VOID@), and enforcing bi-directional pointers between FAM and INDI. We’ll aim for
reification/serialization, reflection, and corresponding proofs. Since it’s a complex example, I’ll provide an extensive explanation, code patterns,
and perhaps a relevant GitHub repository for Agda-based GEDCOM parsing.
#Generating detailed GEDCOM dataset examples and schema
- agda.readthedocs.io (Cubical)
- github.com/FamilySearch/Gedcom
- gedcom.io/tools
- cqts.github.io (Transport)
- familysearch.org/gedcom
- pypi.org/project/python-gedcom
- proofassistants.stackexchange.com (Cubical Agda Q&A)
- python-gedcom docs
- jl5000.github.io/gedcomS7
- cj-xu.github.io (Cubical.Foundations.Path)
- topetit.github.io (pygedcom)
#Citing alternative approaches and related repositories
We can recommend GitHub repositories like FamilySearch’s GEDCOM official repo and the GEDCOM-registries repository for YAML registries. If we mention Agda-related repositories like agda/cubical, it might be worth citing. I’ll primarily cite from local files and offer web sources for additional repo recommendations.
#Citing repositories and exploring alternative approaches
- github.com/agda/cubical
- agda.readthedocs.io (latest Cubical)
- deepwiki.com/agda/cubical
- agda.github.io/cubical Prelude
- my-agda.readthedocs.io (Cubical)
#Structuring output with JSON and citations
The answer should start with a JSON code block, including candidate instructions, intentions, and content units. After that, I’ll provide an explanation, including an alternative approach and related GitHub repositories. Citations should be placed outside of code blocks to ensure they display properly but will need to avoid using bold formatting in markdown.
#Formatting GEDCOM examples and citations properly
When presenting the GEDCOM examples, I need to ensure that citations (whether internal file or web-based) are placed outside code blocks to make sure they render correctly. The user requested dummy GEDCOM examples; I’ll provide those in plain text, and I’ll make sure to include citations in explanatory text around code blocks.
#Clarifying the approach for pathToIso and canonicalization
I need to provide multiple GEDCOM fragments, use them to define constants in Agda, and then demonstrate the `Iso`, `isoToPath`, and `pathToIso` loops. These should show how a specific path is constructed
from a domain problem, and then from that path, I’ll extract the corresponding Iso. I also need to ensure that any
canonicalization algorithm creates unique output.
#Clarifying the loop for canonicalization and equality
The main idea is to construct a path between two types, derived from univalence or iso. We start with the semantics `Sem`, define a property `WF` on the serialized `Serial`, and show that `Sem` and `Σ Serial WF` are isomorphic. From there, `isoToPath` provides the path, which then leads to using `pathToIso` to return to the original structure, completing the loop.
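A sketch of this Structure → Property → Equality → Structure loop, with `Sem`, `Serial`, and `WF` taken as module parameters. The isomorphism `semIso` is assumed to be given, not constructed, and `Sem≡Canon`, `canonIsSet`, `reify`, and `reflect` are illustrative names. It relies on `subst` and `isSet` from the Cubical prelude and is best-effort, not typechecked here.

```agda
{-# OPTIONS --cubical --safe #-}
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism using (Iso; isoToPath)
open import Cubical.Foundations.Transport using (pathToIso)

module _ (Sem Serial : Type) (WF : Serial → Type)
         (semIso : Iso Sem (Σ Serial WF)) where

  -- Equality: the semantic graph and the well-formed serial are equal types.
  Sem≡Canon : Sem ≡ Σ Serial WF
  Sem≡Canon = isoToPath semIso

  -- Property: any predicate on types moves across the path, e.g. being a set.
  canonIsSet : isSet Sem → isSet (Σ Serial WF)
  canonIsSet = subst isSet Sem≡Canon

  -- Structure again: both directions of the pipeline, read off the path.
  reify : Sem → Σ Serial WF
  reify = Iso.fun (pathToIso Sem≡Canon)

  reflect : Σ Serial WF → Sem
  reflect = Iso.inv (pathToIso Sem≡Canon)

  -- Round-trip correctness comes packaged with pathToIso.
  reflect-reify : (g : Sem) → reflect (reify g) ≡ g
  reflect-reify = Iso.leftInv (pathToIso Sem≡Canon)
```

The same `subst` pattern works for any type-level invariant, which is why the property step needs no domain-specific proof once the path exists.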
#Explaining the univalence loop and pipeline
We start with `SemIso : Iso Sem Canon`, then use `isoToPath` to show the path between `Sem` and `Canon`. With `pathToIso`, we can recover the isomorphism, moving between types by transport. This extends
to dependent types as well. The free tree → canonical serial → semantic graph pipeline involves showing isomorphisms and
conversions between these stages, providing the necessary paths and equality.
#Demonstrating finite index-based encoding
To demonstrate this, I can show how to represent concepts with `Fin n` for indexing. Once you have an isomorphism
between `Id` and `Fin n`, you can use `pathToIso` to transport a graph from an `Id`-based
structure to an index-based structure automatically. This helps in reifying to a finite, index-based encoding for storage or algorithms.
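A small sketch of that reindexing, assuming a deliberately simple graph shape in which each person maps to a list of parents. `Graph`, `ix`, `Id≡Fin`, `Graph≡`, and `reindex` are hypothetical names, `Fin` is taken from `Cubical.Data.Fin`, and the snippet is best-effort rather than typechecked here.

```agda
{-# OPTIONS --cubical --safe #-}
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism using (Iso; isoToPath)
open import Cubical.Foundations.Transport using (pathToIso)
open import Cubical.Data.Nat using (ℕ)
open import Cubical.Data.Fin using (Fin)
open import Cubical.Data.List using (List)

-- Hypothetical toy graph: every person points to a list of parents.
Graph : Type → Type
Graph Id = Id → List Id

module _ {Id : Type} {n : ℕ} (ix : Iso Id (Fin n)) where

  -- The domain fact "the IDs are exactly n slots", as a path between types.
  Id≡Fin : Id ≡ Fin n
  Id≡Fin = isoToPath ix

  -- Graph is a type family, so the path lifts to graphs by congruence.
  Graph≡ : Graph Id ≡ Graph (Fin n)
  Graph≡ = cong Graph Id≡Fin

  -- pathToIso extracts the reindexing program, in both directions.
  reindex : Iso (Graph Id) (Graph (Fin n))
  reindex = pathToIso Graph≡
```

No traversal code is written for `Graph`; the reindexing is generated from the shape of the type family, so a richer graph record would only change the definition of `Graph`.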
#Crafting domain-faithful response with examples
To compile the response, I’ll reference pathToIso and substIso definitions along with transportIsoToPath lemmas. The response will
include code modules and GEDCOM examples, embedded as code blocks, showing fragments of `GEDCOMLine` and `FreeTree` representations. For `GEDCOMLine`, I’ll use enumerations, and for `FreeTree`, I’ll consider a hierarchical `Node`. Additionally, I’ll outline the bridging process between the…
#Clarifying GEDCOM pipeline stages
The pipeline appears to flow as: free tree → canonical serial → semantic graph. The free tree serves as an uncanonicalized GEDCOM
representation, while the canonical serial refines it with normalization and invariants. The semantic graph is derived from the canonical
serial. I’ll define the free tree as `GTree`, representing any GEDCOM structure, and canonical serial as `Canon` = `Σ Free WF`. The final step is extracting the semantic graph (`Graph`).
#Defining “mini-GEDCOM” subset for simplification
I’m considering that while Graph and Canon may not be entirely isomorphic (since some info may be lost in the transition), the user wants a domain-faithful iso for the pipeline. Since the full GEDCOM spec is large, we’ll focus on a more manageable subset, calling it “G7-mini,” concentrating on key records and relationships like INDI, FAM, HEAD, and TRLR, along with additional features.
#Defining canonical serial and semantic graph with simple proofs
I’m thinking about defining the canonical serial type as a record with lists of individuals and families. The semantic graph could be similarly
structured, and the isomorphism would be trivial. For the pipeline, I might refine definitions of canonical serial as a `List Record` and semantic as `Genealogy`, with proofs of isomorphism based on splitting and merging records while maintaining invariants.
#Designing a simple pipeline for GEDCOM data
To match the user’s request, I’ll start by structuring a simple pipeline:
- Sem: A typed graph with entities like `Person` and `Family`.
- Canon: A canonical GEDCOM 7 subset AST, covering `GEDC` `vers`, `schema`, and lists of individuals/families.
- Free: A “free tree” representation of tags and values, defined with a tree structure.
I’ll keep the design simple but faithful to the problem!
#Clarifying the Iso loop and pipeline design
The user mentioned using an “Iso + isoToPath + pathToIso loop”, which fits with my “free tree → canonical serial → semantic graph” pipeline. So, the iso loop probably happens between the free tree and the semantic graph, with the canonical serial as the intermediate step. The path-building happens with the free tree before canonicalization and semantics are applied.
#Explaining the pipeline and iso loop
The pipeline is defined as free tree → canonical serial → semantic graph. They want to close a loop between structure, property, and equality: starting with semantics (structure), producing canonical serial (structure + well-formedness), then proving equality and iso. The free tree is input, canonical serial is output of serialization, and the semantic graph reflects parsing. The iso between free tree and canonical serial would hold if order doesn’t matter, but ordering might need adjustments.
#Proposing a solution structure
- Focus on the iso between the semantic graph and canonical serial.
- Show optional iso between canonical serial and raw GEDCOM lines list.
- For the free tree, use a projection (`forget`) from canonical serial to free tree and an embedding (`embed`) from free tree plus property to canonical serial, structured as a Σ-type.
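The Σ-type structure in the last bullet can be sketched as follows. `Free` and `WF` are placeholder module parameters (assumptions), standing in for the free-tree type and its well-formedness predicate:

```agda
{-# OPTIONS --cubical #-}
open import Cubical.Foundations.Prelude

-- `Free` and `WF` are assumed parameters, not the real GEDCOM definitions.
module _ (Free : Type) (WF : Free → Type) where

  -- Canonical serial: a free tree paired with evidence of well-formedness.
  Canon : Type
  Canon = Σ Free WF

  -- Projection: drop the evidence.
  forget : Canon → Free
  forget = fst

  -- Embedding: a free tree plus its property yields a canonical serial.
  embed : (t : Free) → WF t → Canon
  embed t wf = t , wf

  -- Forgetting after embedding returns the original tree definitionally.
  forget-embed : (t : Free) (wf : WF t) → forget (embed t wf) ≡ t
  forget-embed t wf = refl
```

Note that `forget` alone is not an isomorphism: only trees that satisfy `WF` have a preimage.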
#Crafting the six recipes
- Recipe A: Define CanonSerial as Σ FreeTree WF (Structure→Property).
- Recipe B: Create reflect and reify functions between SemGraph and CanonSerial, prove they form isomorphisms (Property→Structure).
- Recipe C: Use isoToPath to establish SemGraph ≡ CanonSerial; apply transport (Structure→Equality).
- Recipe D: Recover canonical conversions using pathToIso along the path (Equality→Structure).
- Recipe E: Use iso between Id and Fin n to transport SemGraph (Indexing/Finite Reification).
- Recipe F: Compose multiple isomorphisms into a single path/iso across the pipeline (Normalization/compositional isos).
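Recipe F can be sketched with the library's `compIso`. The stage types `A`, `B`, `C` and the names `pipelineIso`, `pipelinePath`, and `pipeline-computes` are illustrative assumptions:

```agda
{-# OPTIONS --cubical #-}
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism
  using (Iso ; compIso ; isoToPath ; transportIsoToPath)

-- A, B, C stand for three pipeline stages; f and g are the stage isos.
module _ {ℓ : Level} {A B C : Type ℓ} (f : Iso A B) (g : Iso B C) where

  -- Compose the stage isomorphisms into one.
  pipelineIso : Iso A C
  pipelineIso = compIso f g

  -- Turn the composite into a single path of types.
  pipelinePath : A ≡ C
  pipelinePath = isoToPath pipelineIso

  -- Transport along the composite path agrees with running both stages.
  pipeline-computes : (a : A)
    → transport pipelinePath a ≡ Iso.fun g (Iso.fun f a)
  pipeline-computes a = transportIsoToPath pipelineIso a
```

Composing at the `Iso` level first keeps the forward map readable; composing paths with `_∙_` would give an equal path, but transport along it leaves extra `transportRefl` steps to simplify.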
#Crafting GEDCOM module with minimal definitions
To import essential modules, we’ll bring in the required cubical libraries:
```agda
{-# OPTIONS --cubical #-}
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism
open import Cubical.Foundations.Transport
open import Cubical.Data.Sigma
open import Cubical.Data.List
open import Cubical.Data.Maybe
open import Cubical.Data.Nat
open import Cubical.Data.Unit
open import Cubical.Data.Fin
open import Cubical.Data.Sum
```
Now, the code grouping begins under the module `GEDCOM.PathToIso.UseCases`. After defining enumeration types like Sex and Pedigree,
I’ll set up record types for family (FamRec) and individual (IndiRec) with the vital fields. We’ll carry out minimal
definitions but with enough to compile successfully.
#Defining canonical and semantic graphs
I’ll define `Canon` as a record containing lists of individuals (`IndiRec`) and families (`FamRec`).
A `SemGraph` record will be similar, but it stores persons and families using nodes, which are just `IndiRec` and `FamRec`.
The difference between `Canon` and `SemGraph` is in how they store relationships. The canonical serial keeps bi-directional
pointers (from individuals to families and back), while the semantic graph normalizes this, storing families and persons without redundancy. This
approach ensures that the canonical serial has consistent, reciprocal pointers, while the semantic representation focuses on simplicity.
#Defining uniqueness and computed fields in serial
The original serial may have duplicates due to ordering, but the canonical definition allows us to compute individual pointers (like FAMS/FAMC)
from family records, ensuring uniqueness. These pointers won’t be stored separately but are computed, aligning with the `Canon` property.
For isomorphism, the stored FAMS/FAMC values must match the computed ones exactly, ensuring consistency. This makes the reflect/reify functions
inverse to each other under this property.
#Designing serial and semantics representation
We could define a new `Serial` representation, where `persons` are `PersonCore` types without family pointers, while `families` are `FamCore`. Derived functions like `famsOf` would recreate family pointers, but this doesn’t show bidirectionality. To capture that, we introduce a `WF` property, where canonical serial represents both `Serial` and the `WF` property. Semantics would avoid redundant pointers, aligning with the `SerialWF` isomorphism, ensuring derived pointers match the stored ones.
#Reflect and reify as definitional inverses
If we define pointers based on derived lists, the equality proofs become reflexive (refl). In that case, reflect and reify could be seen as definitional inverses of each other, essentially transforming the data between different representations while maintaining consistency between them.
#Defining `Sem` and `Serial` for genealogical data
Here’s my attempt at defining two representations:
- `Sem` (semantic graph) stores persons with attributes and families with relationships (parents and children).
- `Serial` (canonical form) also stores individuals and families but includes additional data like families’ pointers and children’s pedigrees.
The goal is to keep `Sem` simplified without redundancy, focusing on relationships, while `Serial` could potentially include more structure, like `FamRec` and `IndiRec`.
#Redesigning child’s relation storage in the semantic graph
In GEDCOM, the child’s relationship (PEDI) is actually stored in the individual’s FAMC substructure, not the family record (CHIL). I’m thinking about translating this structure into the semantic graph. Each family will store children, and relationship types (PEDI) will be stored as attributes of the child-family edges.
#Refining the semantics and enforcing bidirectional pointers
In this design, the family record contains a list of child IDs, while the individual record has a list of (family ID × pedigree), leading to
redundancy between FAM.CHIL and INDI.FAMC. Spouse relationships are also mirrored. To avoid this, we enforce the `WF` property for
bidirectional relationships. The semantics will only store each relationship once, enough to derive FAMC and FAMS for each individual.
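A relational sketch of this design, under simplifying assumptions: edges are proof-relevant relations rather than lists, pedigree and spouse links are omitted, and the record and field names (`childOf`, `chil`, `famc`) are hypothetical.

```agda
{-# OPTIONS --cubical #-}
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism using (Iso ; idIso)

module _ (IndiId FamId : Type) where

  -- Semantic side: each child edge is stored exactly once.
  record Sem : Type₁ where
    field
      childOf : FamId → IndiId → Type

  -- Serial side: both directions are stored, mirroring FAM.CHIL and INDI.FAMC.
  record Serial : Type₁ where
    field
      chil : FamId → IndiId → Type
      famc : IndiId → FamId → Type

  -- Well-formedness: the two pointer directions agree.
  WF : Serial → Type
  WF s = (f : FamId) (i : IndiId) → Iso (Serial.chil s f i) (Serial.famc s i f)

  -- reify derives the redundant direction, so WF holds by the identity iso.
  reify : Sem → Σ Serial WF
  reify g = record { chil = Sem.childOf g ; famc = λ i f → Sem.childOf g f i }
          , λ f i → idIso

  -- reflect keeps only the family-side edges.
  reflect : Σ Serial WF → Sem
  reflect (s , _) = record { childOf = Serial.chil s }

  -- This round trip holds definitionally.
  reflect-reify : (g : Sem) → reflect (reify g) ≡ g
  reflect-reify g = refl
```

The other round trip, `reify (reflect s) ≡ s`, is not attempted here: it would need the `WF` evidence converted into paths, and it generally needs a `WF` that is a proposition.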
#Clarifying type constraints and decidability issues
The idea is to keep the code manageable by defining `Sem` in a generic way, avoiding direct computation for `reify`. For functions like `famsOf`, we use recursion over a list, but without decidable equality on Id, we can’t check for membership. To work around this, we might restrict Id to types with decidable equality, such as `Fin n`, or use `Discrete` if necessary for generic Id types.
#Choosing between DecEq and simplifying design
We can define our own `DecEq` type for comparing values with a defined decidable equality. This could be based on `Cubical.Relation.Nullary`. However, if we include `DecEq`, the code becomes more robust but less flexible. Another option is to store redundant FAMS/FAMC lists in `Sem`, though that wouldn’t be ideal. To simplify, we might choose `Id = Fin n`, which allows for easier membership computation and index-based reification, aligning with GEDCOM xref names.
#Choosing Id types with decidable equality for Sem and Serial
I’m leaning towards using `Fin n` for individual and family IDs, as it allows for defined equality. We can set `nIndi` and `nFam` separately, and for a dummy dataset, I could define enumerated Id types as Fin. To demonstrate the iso between Sem and Serial, I also plan to implement scanning functions for family and individual relationships, using `map` and `filter` operations on the data.
#Computing membership and defining reify function
We need to compute membership in the children list. We’ll implement this ourselves using DecEq and `any`. Additionally, we’ll compute `famc` entries along with pedigree info. In the Sem format, each family holds a `ChildRel` record for each child with relationship and pedigree; scanning through it allows us to collect the needed data for both child and family records. It’s pretty straightforward to convert this to the Serial format, transforming lists as needed.
#Reflection on Sem to Serial conversion and canonicalization
To convert from Serial to Sem, we need to join family records with pedigree info from individual records. For missing pedigree, we’ll assume a default value (like BIRTH). To maintain isomorphism, Serial will always include PEDI (even if optional in the spec). We’ll adopt a canonicalization rule to explicitly include PEDI and ensure data consistency. Similarly, optional FAMS pointers will be canonicalized for isomorphism.
#Considering ISO and providing examples with proofs
To preserve isomorphism, we can use projections between records and Σ types, or potentially `Iso`. While the user asked for code, a good
solution might be to provide both: a general skeleton using postulates for theorem equivalence proofs, and a fully computed example with proofs for a
small dataset showing reflexivity.
#Adapting for a compiling domain-faithful loop
For a dummy dataset, defining `SemEx` and `SerialEx` works to show how `reify SemEx` equals `SerialEx` by reflexivity. However, the user wants code for a compiling domain-faithful loop. I’ll provide general functions with `decEq`, `reify`/`reflect`, and a concrete dummy instantiation, using postulates for proofs where necessary. This shouldn’t require proving inverses.
#Designing `serialWF` for semantics
I’m thinking about using the cubical library’s `Iso` builder with forward/back transformations, and showing direct equality via `refl` when we choose definitional inverses. For `serialWF`, I could define it as a record that includes both the person’s core and families, with redundant lists fixed via computations. This gives an isomorphism to `Sem`.
#Defining reflect and reify with serialWF
If I define `serialWF` as a record, I can use `reflect` to extract the core and family edges, and `reify` to construct the serial with computed redundant lists. This ensures that `reflect (reify s)` equals `s` by definition. Using `wf` guarantees the equality of these redundant lists, so there is no need for complex lemmas; the proof just relies on the `wf` property.
#Formalizing serial and semantic structure
To define the `Serial`, I need to include `persons`, `families`, `fams`, and `famc` fields, and compute redundant lists from the family edges. We’ll store relationships in the `WF` property. The `reflect` function can transfer these relationships from `serialWF` to `Sem`, with minimal lookups. For simplicity, storing relationships on the child side helps
avoid complexity.
#Defining semantics and simplifying serial representation
Let’s define the semantic graph as a record with fields for persons, families, and child relations with pedigree info. Then, simplifying the serial
representation to store family children lists without pedigree and using computed projections helps avoid complex list calculations. I’ll utilize `Σ` types and provide skeleton code, focusing on Iso conversions with patterns, explanations, and concrete examples, including postulates where needed.
#Defining and showing isomorphism between Sem and SerialWF
We can define an isomorphism between `Sem` and…
#Defining canonical serial and proving isomorphism
I’m thinking of defining canonical serial as the GEDCOM format derived from semantics but treating it as an image of reify. The isomorphism can be
defined as `Iso Sem Canon`, where Canon is the Σ-type `Σ Sem P`. For each `s`, we need to show equality by proving
that the proposition P holds. This simplifies to equality for the first component.
#Defining isomorphism between serial and canonical structure
I remember a classical pattern: when extending a structure with a property, the resulting Σ-type becomes isomorphic to the original. In cubical Agda, you can do this without postulates, using the `Σ≡Prop` lemma. For canonical serial, we might define `Canon` as `Σ Serial WF`, and work on creating an isomorphism between semantics and canon if `WF` always holds.
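A sketch of that pattern, assuming `WF` is proposition-valued and holds for every serial value. Both hypotheses are module parameters here, and the names `isPropWF`, `wfAlways`, and `canonIso` are illustrative:

```agda
{-# OPTIONS --cubical #-}
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism using (Iso ; iso)
open import Cubical.Data.Sigma using (Σ≡Prop)

module _ {ℓ ℓ' : Level} {Serial : Type ℓ} {WF : Serial → Type ℓ'}
         (isPropWF : (s : Serial) → isProp (WF s))   -- assumption: WF is a proposition
         (wfAlways : (s : Serial) → WF s)            -- assumption: WF always holds
         where

  canonIso : Iso Serial (Σ Serial WF)
  canonIso = iso (λ s → s , wfAlways s)        -- attach the evidence
                 fst                           -- drop the evidence
                 (λ c → Σ≡Prop isPropWF refl)  -- evidence is unique, so compare first components
                 (λ s → refl)
```

The `iso` constructor is applied positionally (forward map, inverse, section proof, retraction proof), which sidesteps the field-name difference between cubical library versions.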
#Defining a canonical serial structure and isomorphism
I was thinking of defining `Canon` with `raw` as the GEDCOM lines and `sem` as the semantic graph, plus a proof that `raw` encodes `sem`. However, I realized that an isomorphism between `SemGraph` and `Canon` might not hold because `raw` is data, not a property. Maybe another option would be to consider `Canon` as encoding types of semantic graphs, ensuring there’s exactly one serial form per semantic graph, which would define the isomorphism.
#Defining semantic graph and canonical serial relationship
I want to think about how we map a “record-of-arrays” to an “array-of-records,” which is invertible and canonical. We can present this as a canonical serial, then explain how it maps to GEDCOM lines (invertible with correct ordering). So, let’s define:
- `Sem` as a record-of-relations, potentially using maps with Id.
- `Serial` as a flat adjacency list (edge list) with `Fin n` indexing, plus attributes. This helps show the isomorphism between the graph and the adjacency list.
#Defining pipeline structure for genealogical data
The process seems to involve parsing GEDCOM into a semantic graph, then encoding that into canonical serial. But here’s a thought: perhaps the “free tree” precedes the canonical serial and semantic graph. The free tree represents raw data, canonically encoded into serial, then decoded into the graph. We can treat the canonical serial as deterministic encoding, and the semantic graph as Id-based. This could be a solid design.
#Breaking down the genealogical data pipeline
Okay, let’s break this into clear layers for better understanding:
- Layer 1: Convert GEDCOM into a free tree (basically, going from syntax to structure with dummy examples).
- Layer 2: Next, we turn the free tree into a semantic graph (meaning-level), with proper reflections.
- Layer 3: Transform the semantic graph into a canonical serial (reified as an indexed adjacency list).
- Layer 4: Make the whole pipeline bidirectional using isoToPath and pathToIso, with proofs in place.
This way we can demonstrate the process clearly.
#Creating loops for demonstrative examples
To show the loops:
- Convert `Iso SemGraph CanonSerial` to `SemGraph ≡ CanonSerial`, then apply `pathToIso` over and over.
- Use `Id ≡ Fin n` to transport the whole graph type.
- Illustrate property transport, such as acyclicity or reciprocity, as they relate to the domain.
The user wants extensive dummy examples, so I’ll define `SemExample` from GEDCOM data. We’ll demonstrate the iso between `SemExample` and `SerialExample` as well as use `pathToIso`. These examples will serve as our dataset for the code.
#Crafting response structure
- Start with a JSON code block.
- Outline “GEDCOM 7 ground rules,” referencing spec sections like dataset order, line grammar, INDI and FAM structure, and bidirectional requirements.
- Provide 2-3 dummy examples of GEDCOM files with annotated features like marriages, divorces, adoptions, and placeholders.
- Present the pipeline: Define simplified types for Tag, Pointer, Pedigree, and others.
- Introduce the `SemGraph`, `Serial`, and `FreeTree` types.
- Include a `WF` property to capture spec constraints.
#Providing code structure and examples
- Provide reify and reflect functions, then build an Iso.
- Use `isoToPath` to show how to transport proofs and values.
- Demonstrate path-to-iso conversion functions, mentioning the inverse of transport.
- Use case examples: reification with `Fin`, graph type conversions between `SemGraph Id` and `SemGraph Fin`.
- Explore advanced patterns, like `substIso`, `transportIsoToPath`, and `codomainIso`.
- Include an Alternative Approach using `Equiv` or SIP. Provide code and GitHub links, referencing the recipes A-F.
#Ensuring citations and Agda code structure
I’ll gather all relevant citations for the spec statements. Each one will have its citation related to specific aspects like dataset order, pointers,
INDI structures, and more. These will be included after relevant paragraphs, but not inside code blocks. I’ll make sure the code snippet uses the correct
syntax, mentions practical recipes, and ensures proper type checks, starting with the `{-# OPTIONS --cubical #-}` declaration.
#Defining GEDCOM types and handling tags
I’ll define the module with required dependencies and start by defining basic data types like `Tag`, representing GEDCOM tags; we’ll assume a `String` representation to keep things flexible. Later, …
#Defining key structures
I’m moving forward with defining the `PersonCore` and `SemFamily` records. `PersonCore` holds information on a person’s name and sex, while `SemFamily` encapsulates family relationships, including IDs for husband, wife, and children, alongside pedigree
information. No need to cite these details as they’re foundational for the task at hand!
#Defining semantic graph and serial records
I’m setting up a `SemGraph` to represent the relationships between people and families,
essentially a semantic graph. The `FamRec` and `IndiRec` records capture family
and individual data akin to GEDCOM format. The `Serial` records bring all this together,
providing a complete dataset. Finally, a well-formedness predicate will enforce bidirectional pointers
between entities.
#Defining well-formedness (WF)
I’m outlining the `WF` record to represent the well-formedness of the family and individual records. For now, I don’t want to compute membership; I want to show how the structure defines look-up functions for individuals and families, paired with proofs for their existence. This approach treats lists as finite maps with associated proofs, but it’s a bit more computationally complex.
#Defining a canonical serial representation
Since the user wants something broad, I’ll describe a pattern. For a more abstract demonstration, we
might not need to fully implement `WF`. Instead, I’ll define a canonical serial where
individuals and families are stored in functions from `Id` to record, avoiding look-ups.
By using finite sets like `Fin n`, we’ll get an indexed, canonical serial with no duplicates.
#Defining the canonical serial with iso
The goal is to create a canonical serial where `nIndi` and `nFam` represent the number
of individuals and families, respectively. The mapping from `IndiId` and `FamId` to
finite indices is defined using an `Iso`. We can directly use `Fin` to define this
serial. The key part is parameterizing semantics and showing the isomorphisms between original IDs and the
finite indices.
#Simplifying the graph representation and encoding
Instead of using heavy transformations, I’ll define the canonical serial by representing it with a nested “GEDCOM-style” format, including FAMS/FAMC pointers. This is closer to the expected file format. For index-based encoding, an additional step can be added without overcomplicating the process.
#Outlining use cases and next steps
I’m thinking of delivering several use cases:
- Round-trip between Semantic Graph and Canonical GEDCOM record with charts and bidirectional pointers.
- Reify the graph into an adjacency list, applying transport algorithms.
- Canonicalize tags schema and demonstrate ISO mapping.
- Handle multi-line notes and show conversion between representations.
The user seems to want broad coverage, so I’ll focus on the core principles and leave some details as postulated.
#Planning the response and generating example dataset
I need to be honest: I can’t guarantee the code will compile without testing it, but we can design it using well-known library modules that should work. I’ll show the code and clarify that it might need minor tweaks. For the GEDCOM dataset, I’ll create a header with required fields and cite the appropriate sources.
#Creating GEDCOM structure and family definitions
To demonstrate, I’ll set up families with pointers, events, and roles. For example, a step-parent family with
PEDI and PHRASE attributes, an adoption event linking individuals, and godparent associations. I’ll also show
placeholders for unknown children with `@VOID@`. Details will be cited, and families like F1 will be
defined with marriage, divorce, and child relationships.
#Creating dummy datasets and encoding in Agda
I’ll create two dummy datasets: one showing a complex family structure with step-parents and adoption, and another showcasing equivalent semantics with different field orderings and missing optional data. The second example will demonstrate canonicalization and unique representation. I’ll encode the GEDCOM lines as a list of strings for Agda.
#Defining `Line` type and dataset representation
I’ll define a `Line` type for Agda, with fields like level, tag, and value (either a pointer or text). To handle the pointer vs string distinction, I’ll use a `LineVal` data type, though for now, I’ll stick with strings for simplicity in the dummy dataset. I’ll then create `rawGEDCOM : List Line` representing the example dataset.
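A minimal sketch of such a `Line` record and a tiny dataset. The field names, the extra `xref` field, and the sample records are assumptions for illustration; pointers are kept as plain strings, as described above:

```agda
{-# OPTIONS --cubical #-}
open import Cubical.Foundations.Prelude
open import Cubical.Data.Nat using (ℕ)
open import Cubical.Data.List using (List ; [] ; _∷_)
open import Cubical.Data.Maybe using (Maybe ; nothing ; just)
open import Agda.Builtin.String using (String)

-- One GEDCOM line: level, optional cross-reference id, tag, optional payload.
record Line : Type where
  constructor line
  field
    level : ℕ
    xref  : Maybe String
    tag   : String
    value : Maybe String   -- pointer or text, both kept as strings here

-- Hypothetical dummy dataset: one individual and one family.
rawGEDCOM : List Line
rawGEDCOM =
    line 0 nothing       "HEAD" nothing
  ∷ line 1 nothing       "GEDC" nothing
  ∷ line 2 nothing       "VERS" (just "7.0")
  ∷ line 0 (just "@I1@") "INDI" nothing
  ∷ line 1 nothing       "NAME" (just "Alex /Example/")
  ∷ line 1 nothing       "FAMS" (just "@F1@")
  ∷ line 0 (just "@F1@") "FAM"  nothing
  ∷ line 1 nothing       "HUSB" (just "@I1@")
  ∷ line 0 nothing       "TRLR" nothing
  ∷ []
```

The `FAMS`/`HUSB` pair is the kind of reciprocal pointer the well-formedness predicate is meant to enforce.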
#Creating an Agda example for iso loops
I’ll define `IndiId` and …
#Defining WF and pointers
I’ll define a placeholder `WF` using `⊤` for simplicity in this tutorial but later plan to show `WF` as a record, though I don’t need to implement it fully for now. For a domain-faithful example, I’ll
incorporate bidirectional pointer constraints to show explicit equalities between computed and stored pointers.
#Defining derived functions in SemGraph
We’ll define `SemGraph` without storing direct FAMS pointers, but we can derive them easily. For `Serial`, we’re storing the FAMS and FAMC lists in individuals and families as edges. To keep things general, I’ll consider `derivedFAMS` and `derivedFAMC` as part of the WF property to avoid relying on a specific example.
#Clarifying canonical serial vs. semantic graph
It looks like the user wants the canonical serial artifact without pairing it with the semantic graph. I’m considering how to define the relationship, perhaps by reification, where the serial is the reified form of the semantic graph. Still, the user asked for a “structure-property-equality” loop, so I need to reconsider how I define it.
#Defining canonical serial for iso with semantics
I’m thinking about making `CanonSerial` purely serial while still maintaining an isomorphism with `SemGraph`. One idea is to define `CanonSerial` as a serial with a property that relates it to a unique reified graph. Alternatively, maybe I should define it as a `SemGraph` with its canonical encoding to keep things simpler and more intuitive.
#Refining the canonical serial and pipeline
The user wants to link the “free tree → canonical
#Considering iso between serial and semantics
I’m thinking of using `Σ SemGraph (λ g → SerialEncoding g)` for a trivial isomorphism with the `SemGraph`. To match the user’s request, I could present both approaches (A1 and A2), with one being a simple design (Design 2). But they’ll want a true iso, not just caches, for correctness.
#Planning the user’s iso task with variations
I think I can deliver two approaches: 1) a challenging theorem-level iso between serial and semantic graph, and 2) an engineering-level iso that’s easier, between semantic graph and its encoded version. I’ll create multiple variation recipes with corresponding code snippets for each level of abstraction.
#Defining a skeleton for iso implementation
Okay, we can define the skeleton for iso without fully implementing WF. For simplicity, we’ll define:
- `Sem` as the semantic graph with persons, families, and pedigree.
- `Serial` as the canonical serial representation with family lists, resembling GEDCOM.
The goal is to establish how `reify` and `reflect` between `Sem` and `Serial` serve as inverse definitions, setting the stage for clearer code.
#Aligning Family Structures and Representation
I see that using `Vec Child n` along with `Vec Ped n` works well for aligning the child and pedigree
data. In this case, the latter is isomorphic to `Vec (Child × Ped) n`, which matches the plan of finite indexing.
The idea is to make the canonical and semantic representations match up at each stage, with possible differences in order
between the two.
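A sketch of that alignment as an explicit `Iso`, reading the claim as “the pair of vectors is isomorphic to the vector of pairs”. `Child` and `Ped` are placeholder parameters, and the helper names are assumptions:

```agda
{-# OPTIONS --cubical #-}
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism using (Iso ; iso)
open import Cubical.Data.Nat using (ℕ)
open import Cubical.Data.Sigma using (_×_)
open import Cubical.Data.Vec using (Vec ; [] ; _∷_)

module _ {Child Ped : Type} where

  zipV : ∀ {n} → Vec Child n → Vec Ped n → Vec (Child × Ped) n
  zipV [] [] = []
  zipV (c ∷ cs) (p ∷ ps) = (c , p) ∷ zipV cs ps

  unzipV : ∀ {n} → Vec (Child × Ped) n → Vec Child n × Vec Ped n
  unzipV [] = [] , []
  unzipV ((c , p) ∷ xs) = (c ∷ fst (unzipV xs)) , (p ∷ snd (unzipV xs))

  -- Zipping the two halves of an unzipped vector restores it.
  zip-unzip : ∀ {n} (xs : Vec (Child × Ped) n)
            → zipV (fst (unzipV xs)) (snd (unzipV xs)) ≡ xs
  zip-unzip [] = refl
  zip-unzip ((c , p) ∷ xs) = cong ((c , p) ∷_) (zip-unzip xs)

  -- Unzipping a zipped pair of vectors restores both.
  unzip-zip : ∀ {n} (cs : Vec Child n) (ps : Vec Ped n)
            → unzipV (zipV cs ps) ≡ (cs , ps)
  unzip-zip [] [] = refl
  unzip-zip (c ∷ cs) (p ∷ ps) =
    cong (λ r → (c ∷ fst r) , (p ∷ snd r)) (unzip-zip cs ps)

  alignIso : ∀ {n} → Iso (Vec Child n × Vec Ped n) (Vec (Child × Ped) n)
  alignIso = iso (λ r → zipV (fst r) (snd r))
                 unzipV
                 zip-unzip
                 (λ r → unzip-zip (fst r) (snd r))
```

Sharing the length index `n` is what rules out mismatched child and pedigree lists; with plain `List`s the zip would lose data and no such isomorphism exists.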
#Crafting Deliverables for GEDCOM Subset
Given the broad scope, I’ll break this down into manageable parts. The goal is to deliver multiple patterns, focusing on modular isomorphisms for substructures instead of a full GEDCOM integration.
For each feature—bi-directional pointers, step-parent, adoption, associations, extension tags, and notes—I’ll provide:
- Example GEDCOM fragment
- Semantic representation in Agda
- Canonical serial representation
- Iso and path loop snippet
- A “recipe” for generalization, including the relevant GEDCOM fragment.
#Ensuring Alternatives
The user requested dummy examples grounded in the spec to adapt recipes into a domain-faithful
Iso+isoToPath+pathToIso loop. I’ll ensure to include at least one alternative method to close the loop,
such as using `Equiv` and `ua` or `Transport`/`Glue`.
Below are recipes you can lift into your codebase for closing the Structure ↔ Property ↔ Equality ↔ Structure loop in Cubical Agda, specifically around reification/serialization of a (graph-y) genealogy structure and using `pathToIso` as the “path ⇒ concrete conversion” bridge.
#0) What `pathToIso` actually gives you (and why it’s perfect for “transport loops”)
In Cubical Agda, a path of types `p : A ≡ B` already is a computational coercion: `transport p : A → B`. `pathToIso` packages that coercion into an `Iso A B` with an explicit inverse and the round-trip proofs.
From the library:
- `Iso.fun (pathToIso p) = transport p`
- `Iso.inv (pathToIso p) = transport⁻ p`
- `Iso.sec` / `Iso.ret` are the usual transport round-trips
Minimal imports (current cubical library layout):
```agda
{-# OPTIONS --cubical #-}
module Genealogy.PathToIso.Recipes where

open import Cubical.Foundations.Prelude  -- also provides `transport`
open import Cubical.Foundations.Transport
  using (pathToIso ; transport⁻ ; substIso)
open import Cubical.Foundations.Isomorphism
  using (Iso ; isoToPath ; Iso≡Set ; transportIsoToPath ; transportIsoToPath⁻ ; isoFunInjective)
open import Cubical.Foundations.Univalence using (ua)
open import Cubical.Foundations.Equiv using (_≃_ ; idEquiv)
-- ⊤ and ⊥ are used by the recipes below; the library names the unit type `Unit`
open import Cubical.Data.Unit renaming (Unit to ⊤)
open import Cubical.Data.Empty using (⊥)
```
#1) The “univalence/transport loop” you want, in one picture
You typically bounce between these representations:
- Structure: your rich domain type (`Genealogy`)
- Property: invariants/canonicity/well-formedness (`WF`)
- Equality: a path of types (often via univalence)
- Structure again: transport/coercions become programs (and can be extracted as `Iso`)
Two key library bridges:
- `isoToPath : Iso A B → A ≡ B` (uses `Glue`) (agda.github.io)
- `pathToIso : A ≡ B → Iso A B` (uses transport/transport⁻) (leibniz-stt.github.io)
And the “coercion is the same as your `Iso.fun`” lemma, `transportIsoToPath`:
- `transport (isoToPath f) x ≡ f .fun x` (agda.github.io)
That last lemma is what makes the loop practically usable in large developments: you can write things in terms of `transport` (generic rewriting) but still recover a clean `Iso` API.
#2) Domain scaffold: genealogies are graphs, so model the “structure vs serialization” boundary explicitly
A genealogy with marriages/step-parents/laterals is not a tree. A practical pattern is:
- Use an ID type `Id` for people (labels)
- Keep structure as relations over `Id`
- Reify into a finite, index-based encoding later (e.g. `Fin n`, adjacency lists, edge lists, etc.)
- Add a well-formedness predicate for the serialized form
Here’s a skeleton that is flexible enough for parents/spouses/step-parents/etc.:
```agda
data Rel : Type where
  parent spouse stepParent sibling : Rel

-- The fields are type-valued, so the record lives one universe up (Type₁).
record Genealogy (Id : Type) : Type₁ where
  field
    -- data about a person
    info : Id → Type
    -- a proof-relevant edge relation (use Bool if you want decidable edges)
    edge : Rel → Id → Id → Type
open Genealogy
```
Now define a “serialized” view (toy example: edge-list form):
```agda
record Serial (Id : Type) : Type₁ where
  field
    edgeₛ : Rel → Id → Id → Type
    infoₛ : Id → Type

WF : ∀ {Id} → Serial Id → Type
WF s = ⊤  -- replace with your real invariants: acyclicity constraints, symmetric spouse, etc.
```
You can make this much more “real” by switching `Id` to `Fin n` and using lists/vectors; the recipes below don’t depend on the specific container.
#3) Recipe A: Reification/serialization as an `Iso`, then turn it into a type equality, then back to `Iso` via `pathToIso`
This is the most common “Structure→Property→Equality” closure:
#Step A1: Build an isomorphism from round-trip (encode/decode) proofs
```agda
-- Domain-specific maps and round-trip proofs, postulated here as placeholders.
postulate
  reify   : ∀ {Id} → Genealogy Id → Σ (Serial Id) WF
  reflect : ∀ {Id} → Σ (Serial Id) WF → Genealogy Id
  reify-reflect : ∀ {Id} (s : Σ (Serial Id) WF) → reify (reflect s) ≡ s
  reflect-reify : ∀ {Id} (g : Genealogy Id) → reflect (reify g) ≡ g

-- Older cubical releases name the last two fields rightInv / leftInv.
reifyIso : ∀ {Id} → Iso (Genealogy Id) (Σ (Serial Id) WF)
reifyIso {Id} = record
  { fun = reify
  ; inv = reflect
  ; sec = reify-reflect
  ; ret = reflect-reify
  }
```
#Step A2: Convert that `Iso` to an actual path of types (univalence via `Glue`)
```agda
p-reify : ∀ {Id} → Genealogy Id ≡ Σ (Serial Id) WF
p-reify = isoToPath reifyIso
```
`isoToPath` is defined using `Glue` and `isoToEquiv` (agda.github.io).
#Step A3: From that path, recover an `Iso` whose forward map is transport
```agda
transportIso : ∀ {Id} → Iso (Genealogy Id) (Σ (Serial Id) WF)
transportIso = pathToIso p-reify
```
And you can connect it back to your “handwritten” `reify`:
```agda
reify≡transport : ∀ {Id} (g : Genealogy Id) → reify g ≡ transport p-reify g
reify≡transport g = sym (transportIsoToPath reifyIso g)
```
Because `transport (isoToPath f)` agrees with `f .fun` (agda.github.io).
#Why this matters in practice
Now any theorem you proved about `Σ (Serial Id) WF` can be “pulled back” to `Genealogy Id` by `transport` (or by the extracted `Iso`).
#4) Recipe B: The “round-trip correctness ⇒ equality of genealogies” workflow (your Structure→Property→Equality loop)
A very common goal: prove `g₁ ≡ g₂` by comparing their canonical serialized forms. If your serialization is injective up to the invariants/canonicalization you chose, you can do:
```agda
canon : ∀ {Id} → Genealogy Id → Σ (Serial Id) WF
canon = Iso.fun reifyIso

-- Because reify has an inverse, equality of serializations
-- gives equality of genealogies:
genealogy-ext : ∀ {Id} (g₁ g₂ : Genealogy Id) → canon g₁ ≡ canon g₂ → g₁ ≡ g₂
genealogy-ext g₁ g₂ eq = isoFunInjective reifyIso g₁ g₂ eq
```
`isoFunInjective` is in `Cubical.Foundations.Isomorphism` (agda.github.io) and is exactly the “reify is injective because it has an inverse” principle.
#Practical genealogy meaning
- Your canonical reification (sorted edges, normalized spouse symmetry, canonical ID renaming, etc.) becomes the equality decision boundary.
- “Equality of structures” becomes “equality of reified normal forms” (which is often computational / decidable).
#5) Recipe C: Alpha-equivalence / person-renaming as a path, then `pathToIso` gives the renaming program for free
This one is gold for “IDs as implementation detail”.
Suppose you have an equivalence between ID types:
```agda
-- Supposed given; in real code these would typically be module parameters.
postulate
  Id₀ Id₁ : Type
  eId : Id₀ ≃ Id₁

pId : Id₀ ≡ Id₁
pId = ua eId
```
`ua` is the inverse part of the univalence isomorphism between `A ≡ B` and `A ≃ B` (agda.github.io). Now lift that equality through your structure functor:
```agda
pG : Genealogy Id₀ ≡ Genealogy Id₁
pG = cong Genealogy pId
```
Finally extract the concrete renaming functions:
```agda
renameGenealogy : Genealogy Id₀ → Genealogy Id₁
renameGenealogy = Iso.fun (pathToIso pG)

unrenameGenealogy : Genealogy Id₁ → Genealogy Id₀
unrenameGenealogy = Iso.inv (pathToIso pG)
```
#Practical genealogy meaning
This is exactly what you want when:
- you serialize with `Fin n` but later reindex people
- you canonicalize by “renaming IDs into first-appearance order”
- you compare genealogies “up to relabeling”
You can do the whole relabeling as transport, but `pathToIso` makes it a clean programmatic API.
#6) Recipe D: “Decomposition of a structure” (record ↔ Σ/× normal form), then reuse generic theorems on Σ-types
This matches your “decomposition of a complete genealogy tree via reification”.
Often you want a canonical decomposition type for:
- generic recursion/induction
- generic serialization/printing
- generic equality checking
- generic “merge” or “diff” algorithms
Example: treat `Genealogy Id` as a Σ of its fields:
```agda
-- Type₁ because both fields are Type-valued families.
GenealogyΣ : (Id : Type) → Type₁
GenealogyΣ Id = Σ[ info ∈ (Id → Type) ] (Rel → Id → Id → Type)

toΣ : ∀ {Id} → Genealogy Id → GenealogyΣ Id
toΣ g = info g , edge g

fromΣ : ∀ {Id} → GenealogyΣ Id → Genealogy Id
fromΣ (i , e) .info = i
fromΣ (i , e) .edge = e

toΣ-fromΣ : ∀ {Id} (x : GenealogyΣ Id) → toΣ (fromΣ x) ≡ x
toΣ-fromΣ _ = refl

fromΣ-toΣ : ∀ {Id} (g : Genealogy Id) → fromΣ (toΣ g) ≡ g
fromΣ-toΣ _ = refl

decompIso : ∀ {Id} → Iso (Genealogy Id) (GenealogyΣ Id)
decompIso = record
  { fun = toΣ
  ; inv = fromΣ
  ; sec = toΣ-fromΣ
  ; ret = fromΣ-toΣ
  }
```
Now you can turn this into a type equality and back:
```agda
p-decomp : ∀ {Id} → Genealogy Id ≡ GenealogyΣ Id
p-decomp = isoToPath decompIso

decompIso' : ∀ {Id} → Iso (Genealogy Id) (GenealogyΣ Id)
decompIso' = pathToIso p-decomp
```
So you can:
- write “serialization” against the Σ-normal form
- then transport the result back to the record form
(Again, `isoToPath` is Glue-based [agda.github.io] and `pathToIso` is transport-based [leibniz-stt.github.io].)
#7) Recipe E: Transporting properties (invariants) across your structure/serialization equality using `substIso`
When you have a path of types `p : A ≡ B`, you frequently need not just `A → B`, but also:
- `P : A → Type` transported to `B → Type`
- “invariant types” transported along a path
`Cubical.Foundations.Transport` gives:
```agda
substIso : ∀ {ℓ ℓ'} {A : Type ℓ} (B : A → Type ℓ') {x y : A} → (p : x ≡ y) → Iso (B x) (B y)
substIso B p = pathToIso (cong B p)
```
How this shows up in genealogy work
Say you have an invariant living over the carrier type (IDs):
```agda
NoSelfSpouse : ∀ {Id} → Genealogy Id → Type
NoSelfSpouse {Id} g = (x : Id) → edge g spouse x x → ⊥
```
If you rename IDs via `pId : Id₀ ≡ Id₁`, then `pG : Genealogy Id₀ ≡ Genealogy Id₁` as above, you can move invariants across. The invariant has to vary along the path together with the genealogy, which is what `transport-filler` provides:
```agda
NoSelfSpouse-renamed : (g : Genealogy Id₀) → NoSelfSpouse g → NoSelfSpouse (transport pG g)
NoSelfSpouse-renamed g noSelf =
  transport (λ i → NoSelfSpouse {Id = pId i} (transport-filler pG g i)) noSelf
```
In bigger developments, `substIso` (and `substEquiv`) are often nicer because they give you explicit inverses and round-trip lemmas immediately.
#8) Recipe F: Proving the loop really closes — `pathToIso (isoToPath f)` is your original `Iso` (for sets)
This is the formal “Structure↔Equality↔Structure” closure.
If your types are sets (true for most concrete serialized formats), you can prove:
```agda
pathToIso-isoToPath : ∀ {ℓ} {A B : Type ℓ}
  → (hA : isSet A) → (hB : isSet B)
  → (f : Iso A B)
  → pathToIso (isoToPath f) ≡ f
pathToIso-isoToPath hA hB f =
  Iso≡Set hA hB (pathToIso (isoToPath f)) f
    (λ x → transportIsoToPath f x)
    (λ y → transportIsoToPath⁻ f y)
```
- `transportIsoToPath` / `transportIsoToPath⁻` are in the library [agda.github.io]
- `Iso≡Set` is the “two isos are equal if their components are pointwise equal” lemma for set-level carriers [agda.github.io]
- the `pathToIso` definition is `transport` / `transport⁻` [leibniz-stt.github.io]
- `isoToPath` is Glue-based [agda.github.io]
#Why this matters
It means you can freely move between:
- “I have an `Iso` (program + laws)”
- “I have a type equality `A ≡ B` (rewrite/transport engine)”
…and you don’t lose computational content.
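The pointwise half of this closure needs no set-level hypothesis and can be sketched on its own. The lemma names below are local to this sketch. The proof by `transportRefl` assumes that transport along the Glue-based `isoToPath` reduces the same way it does for `ua`; if your library version disagrees, use the library lemma `transportIsoToPath` mentioned above instead.

```agda
{-# OPTIONS --cubical #-}
module LoopSketch where

open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism
open import Cubical.Foundations.Transport using (pathToIso)

private
  variable
    ℓ : Level
    A B : Type ℓ

-- Transport along `isoToPath f` agrees with the forward map of `f`.
-- (Local restatement of the library lemma referred to in the text.)
transport-along-iso : (f : Iso A B) (x : A)
                    → transport (isoToPath f) x ≡ Iso.fun f x
transport-along-iso f x = transportRefl (Iso.fun f x)

-- Going Iso → Path → Iso returns a forward map that is pointwise
-- equal to the one you started with.
loop-fun : (f : Iso A B) (x : A)
         → Iso.fun (pathToIso (isoToPath f)) x ≡ Iso.fun f x
loop-fun f x = transport-along-iso f x
```

The set-level assumptions in Recipe F are only needed to upgrade this pointwise statement to an equality of whole `Iso` records, since the round-trip proofs must be compared as well.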
#9) How to apply these patterns to “Free Tree genealogy reification” specifically
A good architecture for your domain is:
- Syntax / Free layer: a “free genealogy program” describing edits/relations (add person, add marriage, etc.). You normalize it (sort edges, canonicalize IDs, merge duplicates, etc.).
- Semantic layer: a structured `Genealogy Id` with invariants.
- Serialization layer: a concrete `Serial` (lists/vectors/JSON AST) + `WF`.
Then you:
- prove an `Iso` between semantic and canonical serialized layers: `Iso (Genealogy Id) (Σ SerialCanon WF)`
- turn it into a path: `isoToPath`
- use `pathToIso` to extract coercions as transport programs when you want to rewrite types generically (esp. dependent proofs)
The key trick is: make your serialization canonical (or define a canonical quotient) so that “reify equality” corresponds to “structure equality”.
#Alternative approaches (often useful alongside `pathToIso`)
- Use `pathToEquiv` + univalence directly. Sometimes you want `Equiv` instead of `Iso` (especially for higher structure, or to compose equivalences). The univalence module exposes the equivalence `(A ≡ B) ≃ (A ≃ B)` (via `pathToEquiv` and `ua`). [agda.github.io] Then:
  - `p : A ≡ B` → `pathToEquiv p : A ≃ B`
  - `e : A ≃ B` → `ua e : A ≡ B`
- Use the Structure Identity Principle (SIP). For genealogies-as-structures-on-a-carrier, SIP can let you avoid hand-building `isoToPath` arguments: you show your structure notion is “univalent”, then equality of structured objects corresponds to structure-preserving equivalences. (The cubical library has `Cubical.Foundations.SIP`.) [leibniz-stt.github.io]
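The `pathToEquiv` / `ua` route can be sketched on a toy pair of types. `Tag`, `Key`, and the maps between them are hypothetical; `ua`, `pathToEquiv`, and `uaβ` are assumed to come from `Cubical.Foundations.Univalence`, and `isoToEquiv` from `Cubical.Foundations.Isomorphism`.

```agda
{-# OPTIONS --cubical #-}
module EquivSketch where

open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Equiv
open import Cubical.Foundations.Isomorphism
open import Cubical.Foundations.Univalence

-- Hypothetical surface tags and canonical keys.
data Tag : Type where
  skype cid : Tag

data Key : Type where
  k0 k1 : Key

toKey : Tag → Key
toKey skype = k0
toKey cid   = k1

toTag : Key → Tag
toTag k0 = skype
toTag k1 = cid

tagIso : Iso Tag Key
tagIso = iso toKey toTag
  (λ { k0 → refl ; k1 → refl })
  (λ { skype → refl ; cid → refl })

-- Equivalence → path → equivalence.
tagEquiv : Tag ≃ Key
tagEquiv = isoToEquiv tagIso

tagPath : Tag ≡ Key
tagPath = ua tagEquiv

tagEquiv' : Tag ≃ Key
tagEquiv' = pathToEquiv tagPath

-- Transport along the path computes to the original map.
tagPath-computes : (t : Tag) → transport tagPath t ≡ toKey t
tagPath-computes t = uaβ tagEquiv t
```

Working with `≃` instead of `Iso` pays off mainly when you compose many such steps, because being an equivalence is a proposition, so the proof components never need to be compared.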
#Related GitHub repositories / codebases to mine for patterns
- agda/cubical: the Cubical standard library where `pathToIso`, `isoToPath`, `ua`, `SIP` live. [GitHub]
- agda/agda: the compiler + Cubical mode implementation and docs. [agda.readthedocs.io]
- cqts/introduction-to-cubical (course notes site repo): lots of small, copyable “transport/univalence” patterns (great for building intuition and small lemmas you’ll reuse). [cqts.github.io]
If you want, paste a tiny fragment of your current genealogy types (even just your `PersonId`, your edge representation, and your current serialization target), and I’ll adapt Recipe A/B into a compiling, domain-faithful `Iso` + `isoToPath` + `pathToIso` loop that matches your exact “free tree → canonical serial → semantic graph” pipeline.
#What `pathToIso` is doing for you in the genealogy setting
In Cubical Agda, `pathToIso` is the “computational back-edge” of the loop:
- You prove a Path (a type equality) because your domain argument says two representations are “the same thing”.
- You then want actual functions to convert values back and forth.
- `pathToIso` turns that type equality into an `Iso` so you can extract `fun`/`inv` conversion functions and use them in programs.
The two glue pieces that make this practical (and not just philosophical):
- `transportIsoToPath : (f : Iso A B) (x : A) → transport (isoToPath f) x ≡ f .fun x` lets you replace “mysterious transport” with your actual conversion function. [agda-libraries]
- For set-level types, there’s a tight relationship between `Iso A B` and `A ≡ B` (so `isoToPath` and `pathToIso` behave like inverses up to the right notion). [agda-libraries]
That’s the core loop you want for “free tree → canonical serial → semantic graph”:
Structure (representation) → prove a Property about it → turn equivalence into Equality → transport the property → recover a new Structure conversion by `pathToIso`.
#GEDCOM concepts we’ll explicitly model (subset)
The dummy examples and the Agda recipes below are grounded in these GEDCOM v7 ideas:
- A GEDCOM dataset is a header, followed by 0+ records, ending with a trailer. [The FamilySearch GEDCOM Specifi…]
- Pointers are either a real cross-reference pointer or the special `@VOID@` placeholder (“void pointer”). [The FamilySearch GEDCOM Specifi…]
  - `@VOID@` is used when data is removed but structure/pointers must remain (export/round-trip friendliness). [The FamilySearch GEDCOM Specifi…]
  - `@VOID@` can also appear as a placeholder for an unknown child in a `FAM` record.
- The core linkage is between `INDI` (individual) and `FAM` (family) records, and linkages are expected to be consistent in both directions:
  - `FAM` uses `HUSB`, `WIFE`, `CHIL` pointers into `INDI`. [The FamilySearch GEDCOM Specifi…]
  - The pointed-to `INDI` records should point back using `FAMS` (as spouse) and `FAMC` (as child). [The FamilySearch GEDCOM Specifi…]
- Pedigree / child-family status uses `PEDI` (enum: `ADOPTED`, `BIRTH`, `FOSTER`, `SEALING`, `OTHER` with a `PHRASE`). [The FamilySearch GEDCOM Specifi…]
- Adoption can be described with `ADOP` + `FAMC` and an `ADOP` enum that says which spouse adopted (`HUSB`, `WIFE`, `BOTH`). [The FamilySearch GEDCOM Specifi…]
- Association (laterals, e.g. godparents) uses `ASSO` with `ROLE` (e.g. `GODP`). [The FamilySearch GEDCOM Specifi…]
- Documented extension tags are tied to meaning by a URI in `HEAD.SCHMA.TAG`, and semantic equivalence is by URI, not by the literal tag name. [The FamilySearch GEDCOM Specifi…]
- GEDCOM explicitly allows alternative modeling choices (e.g., separation/reunion can be one family with multiple events vs multiple families with the same spouses). [The FamilySearch GEDCOM Specifi…]
These are exactly the “domain states of affairs” you can turn into Paths, and then into `Iso` via `pathToIso`.
#Extensive dummy GEDCOM fragments
#Example 1 — Step-family + laterals + void-child placeholder + documented extensions + marriage age
This is a single GEDCOM dataset (header + records + trailer). The constructs used are consistent with:
- dataset structure [The FamilySearch GEDCOM Specifi…]
- `FAM` and `INDI` linkage [The FamilySearch GEDCOM Specifi…]
- backlink expectation [The FamilySearch GEDCOM Specifi…]
- `@VOID@` placeholder child
- association role `GODP` [The FamilySearch GEDCOM Specifi…]
- marriage event with wife age substructure [The FamilySearch GEDCOM Specifi…]
- documented extension tags mapped via `SCHMA.TAG` to URIs [The FamilySearch GEDCOM Specifi…]
- `FAMC` + `PEDI` for birth/other pedigree links [The FamilySearch GEDCOM Specifi…]
```gedcom
0 HEAD
1 GEDC
2 VERS 7.0
1 SCHMA
2 TAG _SKYPEID http://example.org/uri/contact/skype
2 TAG _CID http://example.org/uri/contact/customer-id
0 @I1@ INDI
1 NAME Joao /Silva/
1 SEX M
1 BIRT
2 DATE 14 FEB 1980
2 PLAC Sao Paulo, Brazil
1 FAMS @F1@
1 _SKYPEID joao.silva
0 @I2@ INDI
1 NAME Maria /Souza/
1 SEX F
1 FAMS @F1@
1 FAMS @F2@
1 _CID CUST-00991
0 @I3@ INDI
1 NAME Ana /Silva/
1 SEX F
1 BIRT
2 DATE 20 APR 2005
2 PLAC Sao Paulo, Brazil
2 FAMC @F1@
1 FAMC @F1@
2 PEDI BIRTH
1 FAMC @F2@
2 PEDI OTHER
3 PHRASE Step family (moved 2010)
1 ASSO @I4@
2 ROLE GODP
0 @I4@ INDI
1 NAME Carlos /Pereira/
1 SEX M
1 FAMS @F2@
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 01 JAN 2000
2 PLAC Sao Paulo, Brazil
2 WIFE
3 AGE 22y
1 CHIL @I3@
0 @F2@ FAM
1 HUSB @I2@
1 WIFE @I4@
1 MARR
2 DATE 01 JUN 2010
1 CHIL @I3@
1 CHIL @VOID@
2 PHRASE Unknown infant
0 TRLR
```
#Why this example is useful for `pathToIso`
It contains two distinct “canonicalization” pressures:
- Schema normalization: `_SKYPEID`/`_CID` are meaningful by URI, not by tag spelling. [The FamilySearch GEDCOM Specifi…]
- Graph normalization: family links are a graph with intended bidirectional consistency. [The FamilySearch GEDCOM Specifi…]
Both of these become “proof obligations” you can use to build Paths between types of representations.
#Example 2 — Adoption modeled with `ADOP` + `FAMC` + “who adopted” enum
Grounded in the adoption structure and `ADOP` enum semantics. [The FamilySearch GEDCOM Specifi…]
```gedcom
0 HEAD
1 GEDC
2 VERS 7.0
0 @I10@ INDI
1 NAME Beatriz /Lima/
1 SEX F
0 @I11@ INDI
1 NAME Rafael /Lima/
1 SEX M
0 @I12@ INDI
1 NAME Clara /Lima/
1 SEX F
1 FAMC @F10@
2 PEDI ADOPTED
1 ADOP
2 DATE 12 DEC 2012
2 FAMC @F10@
3 ADOP WIFE
0 @F10@ FAM
1 HUSB @I11@
1 WIFE @I10@
1 CHIL @I12@
1 MARR
2 DATE 10 OCT 2000
0 TRLR
```
Interpretation pressure:
- You may want a canonical semantic representation where “adopted-by-wife” becomes a normalized edge or an event attached to the child or family.
- There are multiple plausible internal models; proving they’re equivalent is exactly where `pathToIso` becomes a workhorse.
#Example 3 — Two different extension-tag spellings, same URI semantics
GEDCOM’s spec states that the meaning of a documented extension tag is identified by URI, so different tag spellings can be semantically equivalent if they map to the same URI in the schema. [The FamilySearch GEDCOM Specifi…]
Doc A:
```gedcom
0 HEAD
1 SCHMA
2 TAG _SKYPEID http://example.org/uri/contact/skype
0 @I1@ INDI
1 NAME Joao /Silva/
1 _SKYPEID joao.silva
0 TRLR
```
Doc B (renamed tag):
```gedcom
0 HEAD
1 SCHMA
2 TAG _SKP http://example.org/uri/contact/skype
0 @I1@ INDI
1 NAME Joao /Silva/
1 _SKP joao.silva
0 TRLR
```
This is a perfect “state of affairs → Path” story:
- State of affairs: `_SKYPEID` and `_SKP` are different surface tags, but schema says both mean the same URI. [The FamilySearch GEDCOM Specifi…]
- Canonical serial: normalize extensions to URI keys.
- Equality: “these two extension-tag universes are the same up to schema equivalence”.
- Extract Iso: use `pathToIso` to get actual renaming functions you can use in your reifier/serializer.
#Example 4 — Redaction/removal preserved structurally via `@VOID@`
GEDCOM v7 explicitly notes that pointers to removed structures should be replaced with `@VOID@`, and structures may remain but with empty payloads so exports can round-trip. [The FamilySearch GEDCOM Specifi…]
```gedcom
0 @F20@ FAM
1 HUSB @VOID@
1 WIFE @I2@
1 CHIL @I3@
1 MARR
2 DATE
2 PLAC
```
The “domain pressure” here is: your semantic graph probably wants explicit unknowns (not “missing fields”), and your canonical serial wants a stable placeholder you can round-trip.
`@VOID@` pushes you toward a dedicated pointer type in Agda (`Ptr A` with constructors `void | ref a`), and you’ll want `Iso` between multiple internal encodings of “optional reference”.
#Agda modeling aligned to your pipeline
We’ll set up three layers:
- Free tree: close to GEDCOM record structure; local extension tags; XREF-like IDs.
- Canonical serial: same information but normalized:
  - extension keys normalized to URIs (schema meaning) [The FamilySearch GEDCOM Specifi…]
  - IDs reindexed to a canonical finite index type (e.g. `Fin n`)
- Semantic graph: your “meaningful” relation-level view (parents/spouses/associations), derived from records and free of spelling/ordering artifacts.
Below is a compact, domain-faithful “core” that is enough to demonstrate `Iso ↔ isoToPath ↔ pathToIso` on real GEDCOM-like content.
#Core Cubical tools we’ll rely on
From the Cubical library (as listed in your `agda-libraries.txt`):
- `Iso`, `isoToPath`, `pathToIso`, `compIso`, `invIso`, plus Π-type iso helpers like `codomainIsoDep`. [agda-libraries]
- `transport` and friends (including `transport⁻`). [agda-libraries]
- The crucial bridge lemma `transportIsoToPath`. [agda-libraries]
#Concrete Cubical Agda snippets
#0) Minimal GEDCOM-faithful core types
```agda
{-# OPTIONS --cubical #-}
module Genealogy.GEDCOM.PathToIso where

open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism
open import Cubical.Foundations.Transport
open import Cubical.Data.Sigma
open import Cubical.Data.List as List using (List; []; _∷_; map)
open import Cubical.Data.Maybe as Maybe using (Maybe; nothing; just)
open import Cubical.Data.Unit using (Unit; tt)
open import Cubical.Data.Empty as ⊥ using (⊥)
open import Agda.Builtin.String using (String)  -- String payloads below

open Iso  -- lets us write `fun`, `inv`, `sec`, `ret`

-- GEDCOM pointer: either a real reference or the void pointer (@VOID@).
data Ptr {ℓ : Level} (A : Type ℓ) : Type ℓ where
  void : Ptr A
  ref  : A → Ptr A

mapPtr : ∀ {ℓ ℓ'} {A : Type ℓ} {B : Type ℓ'} → (A → B) → Ptr A → Ptr B
mapPtr f void    = void
mapPtr f (ref a) = ref (f a)

-- A tiny subset of GEDCOM enums we actually use in the examples:
data Sex : Type where
  M F U : Sex

data Pedi : Type where
  ADOPTED BIRTH FOSTER SEALING : Pedi
  OTHER : (phrase : String) → Pedi

data AdopBy : Type where
  HUSB WIFE BOTH : AdopBy

data Role : Type where
  GODP : Role  -- godparent example (ASSO/ROLE)

record Date : Type where
  constructor mkDate
  field txt : String

record Place : Type where
  constructor mkPlace
  field txt : String

record Age : Type where
  constructor mkAge
  field txt : String

-- Individual events: enough to show BIRT with optional FAMC, and ADOP with FAMC + who.
data IndiEvent (F : Type) : Type where
  BIRT : Maybe Date → Maybe Place → Maybe F → IndiEvent F
  ADOP : Maybe Date → F → Maybe AdopBy → IndiEvent F

data FamEvent : Type where
  MARR : Maybe Date → Maybe Place → Maybe Age → FamEvent  -- keep one age for demo
  DIV  : Maybe Date → Maybe Place → FamEvent

-- INDI record core (subset): names, sex, family links, associations, events, extensions.
record IndiRec (P F Ext : Type) : Type where
  constructor indi
  field
    names  : List String
    sex    : Maybe Sex
    famc   : List (F × Pedi)       -- INDI.FAMC with PEDI (step/adopt/etc.)
    fams   : List F                -- INDI.FAMS
    asso   : List (P × Role)       -- ASSO @INDI@ / ROLE ...
    events : List (IndiEvent F)
    ext    : List (Ext × String)   -- extension payloads (keyed by Ext)

-- FAM record core (subset): partners, children (may include void), events, extensions.
record FamRec (P Ext : Type) : Type where
  constructor fam
  field
    husb   : Ptr P
    wife   : Ptr P
    chil   : List (Ptr P)
    events : List FamEvent
    ext    : List (Ext × String)

-- Dataset: a forest of records, keyed by IDs.
-- The clause binds `Fm` rather than `F`, because `F` is a constructor of
-- `Sex` and would be read as a constructor pattern on the left-hand side.
Dataset : (P F Ext : Type) → Type
Dataset P Fm Ext = (List (P × IndiRec P Fm Ext)) × (List (Fm × FamRec P Ext))
```
How this matches the spec (conceptually):
- `Ptr` corresponds to GEDCOM pointer syntax, including `@VOID@`. [The FamilySearch GEDCOM Specifi…]
- `famc` includes `PEDI` including `OTHER + PHRASE`, matching the `PEDI` enum guidance. [The FamilySearch GEDCOM Specifi…]
- `ASSO` with `ROLE` such as `GODP` is explicitly shown in the spec examples. [The FamilySearch GEDCOM Specifi…]
- Family links (HUSB/WIFE/CHIL) and backlink expectations are captured by storing both sides; we’ll define a property to check consistency. [The FamilySearch GEDCOM Specifi…]
#1) Iso combinators you’ll use constantly
These are the “recipe ingredients” for lifting an `Iso` on IDs/tags into an `Iso` on whole datasets:
```agda
-- Iso over lists by mapping.
ListIso : ∀ {ℓ ℓ'} {A : Type ℓ} {B : Type ℓ'} → Iso A B → Iso (List A) (List B)
fun (ListIso i) = map (fun i)
inv (ListIso i) = map (inv i)
sec (ListIso i) []       = refl
sec (ListIso i) (y ∷ ys) = cong₂ _∷_ (sec i y) (sec (ListIso i) ys)
ret (ListIso i) []       = refl
ret (ListIso i) (x ∷ xs) = cong₂ _∷_ (ret i x) (ret (ListIso i) xs)

PtrIso : ∀ {ℓ ℓ'} {A : Type ℓ} {B : Type ℓ'} → Iso A B → Iso (Ptr A) (Ptr B)
fun (PtrIso i) = mapPtr (fun i)
inv (PtrIso i) = mapPtr (inv i)
sec (PtrIso i) void    = refl
sec (PtrIso i) (ref y) = cong ref (sec i y)
ret (PtrIso i) void    = refl
ret (PtrIso i) (ref x) = cong ref (ret i x)

MaybeIso : ∀ {ℓ ℓ'} {A : Type ℓ} {B : Type ℓ'} → Iso A B → Iso (Maybe A) (Maybe B)
fun (MaybeIso i) nothing  = nothing
fun (MaybeIso i) (just a) = just (fun i a)
inv (MaybeIso i) nothing  = nothing
inv (MaybeIso i) (just b) = just (inv i b)
sec (MaybeIso i) nothing  = refl
sec (MaybeIso i) (just b) = cong just (sec i b)
ret (MaybeIso i) nothing  = refl
ret (MaybeIso i) (just a) = cong just (ret i a)

-- Product Iso (pairs).
×Iso : ∀ {ℓ₁ ℓ₂ ℓ₁' ℓ₂'} {A : Type ℓ₁} {B : Type ℓ₂} {A' : Type ℓ₁'} {B' : Type ℓ₂'}
     → Iso A A' → Iso B B' → Iso (A × B) (A' × B')
fun (×Iso i j) (a , b)   = fun i a , fun j b
inv (×Iso i j) (a' , b') = inv i a' , inv j b'
sec (×Iso i j) (a' , b') k = sec i a' k , sec j b' k
ret (×Iso i j) (a , b)   k = ret i a k , ret j b k
```
Why these matter for your pipeline:
- Reindexing people/families is “just” an `Iso` on ID types.
- Schema normalization (local tag → URI key) is “just” an `Iso` on extension-key types.
- Everything else is structural lifting: `ListIso`, `PtrIso`, `×Iso`, etc.
(These combinator patterns are exactly the kind of building blocks the Cubical library itself uses for Π/Σ types and structured transport. [agda-libraries])
#2) Lifting `Iso` through GEDCOM record cores
```agda
-- Pattern variables avoid the names `fam` and `indi`, which are record constructors.
IndiEventIso : ∀ {F F'} → Iso F F' → Iso (IndiEvent F) (IndiEvent F')
fun (IndiEventIso i) (BIRT d p mf)   = BIRT d p (fun (MaybeIso i) mf)
fun (IndiEventIso i) (ADOP d fm who) = ADOP d (fun i fm) who
inv (IndiEventIso i) (BIRT d p mf)   = BIRT d p (inv (MaybeIso i) mf)
inv (IndiEventIso i) (ADOP d fm who) = ADOP d (inv i fm) who
sec (IndiEventIso i) (BIRT d p mf)   = cong (BIRT d p) (sec (MaybeIso i) mf)
sec (IndiEventIso i) (ADOP d fm who) = cong (λ fm' → ADOP d fm' who) (sec i fm)
ret (IndiEventIso i) (BIRT d p mf)   = cong (BIRT d p) (ret (MaybeIso i) mf)
ret (IndiEventIso i) (ADOP d fm who) = cong (λ fm' → ADOP d fm' who) (ret i fm)

-- Rename every ID and extension key inside an INDI record.
-- The inverse direction is the same function applied to the inverted isos.
indiFun : ∀ {P P' F F' Ext Ext'} → Iso P P' → Iso F F' → Iso Ext Ext'
        → IndiRec P F Ext → IndiRec P' F' Ext'
indiFun iP iF iE r =
  indi (IndiRec.names r)
       (IndiRec.sex r)
       (map (λ (f , pedi) → fun iF f , pedi) (IndiRec.famc r))
       (map (fun iF) (IndiRec.fams r))
       (map (λ (p , role) → fun iP p , role) (IndiRec.asso r))
       (map (fun (IndiEventIso iF)) (IndiRec.events r))
       (map (λ (k , v) → fun iE k , v) (IndiRec.ext r))

-- Proofs are routine by record-path + list induction; omitted here for readability.
-- In real code you can prove them field-by-field using `ListIso`/`MaybeIso`/`IndiEventIso`.
postulate
  IndiRecIso-sec : ∀ {P P' F F' Ext Ext'} (iP : Iso P P') (iF : Iso F F') (iE : Iso Ext Ext')
    → (r : IndiRec P' F' Ext')
    → indiFun iP iF iE (indiFun (invIso iP) (invIso iF) (invIso iE) r) ≡ r
  IndiRecIso-ret : ∀ {P P' F F' Ext Ext'} (iP : Iso P P') (iF : Iso F F') (iE : Iso Ext Ext')
    → (r : IndiRec P F Ext)
    → indiFun (invIso iP) (invIso iF) (invIso iE) (indiFun iP iF iE r) ≡ r

IndiRecIso : ∀ {P P' F F' Ext Ext'}
  → Iso P P' → Iso F F' → Iso Ext Ext'
  → Iso (IndiRec P F Ext) (IndiRec P' F' Ext')
IndiRecIso iP iF iE =
  iso (indiFun iP iF iE)
      (indiFun (invIso iP) (invIso iF) (invIso iE))
      (IndiRecIso-sec iP iF iE)
      (IndiRecIso-ret iP iF iE)

famFun : ∀ {P P' Ext Ext'} → Iso P P' → Iso Ext Ext' → FamRec P Ext → FamRec P' Ext'
famFun iP iE r =
  fam (fun (PtrIso iP) (FamRec.husb r))
      (fun (PtrIso iP) (FamRec.wife r))
      (fun (ListIso (PtrIso iP)) (FamRec.chil r))
      (FamRec.events r)
      (map (λ (k , v) → fun iE k , v) (FamRec.ext r))

postulate
  FamRecIso-sec : ∀ {P P' Ext Ext'} (iP : Iso P P') (iE : Iso Ext Ext')
    → (r : FamRec P' Ext')
    → famFun iP iE (famFun (invIso iP) (invIso iE) r) ≡ r
  FamRecIso-ret : ∀ {P P' Ext Ext'} (iP : Iso P P') (iE : Iso Ext Ext')
    → (r : FamRec P Ext)
    → famFun (invIso iP) (invIso iE) (famFun iP iE r) ≡ r

FamRecIso : ∀ {P P' Ext Ext'} → Iso P P' → Iso Ext Ext' → Iso (FamRec P Ext) (FamRec P' Ext')
FamRecIso iP iE =
  iso (famFun iP iE)
      (famFun (invIso iP) (invIso iE))
      (FamRecIso-sec iP iE)
      (FamRecIso-ret iP iE)
```
Why I allowed `postulate` here: the proofs are straightforward but verbose (list induction + record extensionality). In “real code”, you either (a) prove these field-by-field using `ListIso` and `MaybeIso`, or (b) encode the records as nested Σ/products and let `Σ-cong-iso`/`×Iso` compose proofs mechanically. [agda-libraries]
If you want, you can replace these postulates by explicit proofs once you settle your final record shapes; the recipe pattern won’t change.
#3) The dataset Iso (the heart of reindexing + schema normalization)
```agda
DatasetIso : ∀ {P P' F F' Ext Ext'}
  → Iso P P' → Iso F F' → Iso Ext Ext'
  → Iso (Dataset P F Ext) (Dataset P' F' Ext')
DatasetIso iP iF iE =
  ×Iso (ListIso (×Iso iP (IndiRecIso iP iF iE)))
       (ListIso (×Iso iF (FamRecIso iP iE)))
```
This is the practical “lift” you will use everywhere:
- If you can build `Iso` between ID types (XREF renaming/reindexing), and `Iso` between extension key types (tag spelling ↔ URI), you immediately get an `Iso` between whole genealogies.
#Now: the actual `Iso → Path → Iso` loop on a GEDCOM-grounded example
#Example IDs and extension schemas
We’ll model Example 1 with finite IDs (think `@I1@`, `@F1@`, etc.):
```agda
data IRef : Type where
  I1 I2 I3 I4 : IRef

data FRef : Type where
  F1 F2 : FRef

-- Two different *surface* extension-tag alphabets:
data ExtA : Type where
  SKYPEID CID : ExtA   -- corresponds to _SKYPEID, _CID

data ExtB : Type where
  SKP CID' : ExtB      -- corresponds to _SKP, _CID

-- Canonical extension keys (URIs), treated as a small finite universe here.
data ExtURI : Type where
  skypeURI customerIdURI : ExtURI

schemaA : Iso ExtA ExtURI
fun schemaA SKYPEID       = skypeURI
fun schemaA CID           = customerIdURI
inv schemaA skypeURI      = SKYPEID
inv schemaA customerIdURI = CID
sec schemaA skypeURI      = refl
sec schemaA customerIdURI = refl
ret schemaA SKYPEID       = refl
ret schemaA CID           = refl

schemaB : Iso ExtB ExtURI
fun schemaB SKP           = skypeURI
fun schemaB CID'          = customerIdURI
inv schemaB skypeURI      = SKP
inv schemaB customerIdURI = CID'
sec schemaB skypeURI      = refl
sec schemaB customerIdURI = refl
ret schemaB SKP           = refl
ret schemaB CID'          = refl
```
This is exactly the “documented extension tags meaning-by-URI” mechanism in the spec. [The FamilySearch GEDCOM Specifi…]
#The example dataset (core reification of Example 1)
```agda
-- Records are built with their constructors (`indi`, `fam`, `mkDate`, ...),
-- so the record modules do not need to be opened here.
docA : Dataset IRef FRef ExtA
docA =
  ( (I1 , indi ("Joao /Silva/" ∷ []) (just M)
            [] (F1 ∷ []) []
            (BIRT (just (mkDate "14 FEB 1980")) (just (mkPlace "Sao Paulo, Brazil")) nothing ∷ [])
            ((SKYPEID , "joao.silva") ∷ []))
  ∷ (I2 , indi ("Maria /Souza/" ∷ []) (just F)
            [] (F1 ∷ F2 ∷ []) []
            []
            ((CID , "CUST-00991") ∷ []))
  ∷ (I3 , indi ("Ana /Silva/" ∷ []) (just F)
            ((F1 , BIRTH) ∷ (F2 , OTHER "Step family (moved 2010)") ∷ [])
            []
            ((I4 , GODP) ∷ [])
            (BIRT (just (mkDate "20 APR 2005")) (just (mkPlace "Sao Paulo, Brazil")) (just F1) ∷ [])
            [])
  ∷ (I4 , indi ("Carlos /Pereira/" ∷ []) (just M)
            [] (F2 ∷ []) []
            []
            [])
  ∷ [] )
  ,
  ( (F1 , fam (ref I1) (ref I2) (ref I3 ∷ [])
            (MARR (just (mkDate "01 JAN 2000")) (just (mkPlace "Sao Paulo, Brazil")) (just (mkAge "22y")) ∷ [])
            [])
  ∷ (F2 , fam (ref I2) (ref I4) (ref I3 ∷ void ∷ [])
            (MARR (just (mkDate "01 JUN 2010")) nothing nothing ∷ [])
            [])
  ∷ [] )
```
This matches the spirit of:
- `FAM` structure with `HUSB`/`WIFE`/`CHIL` pointers and `@VOID@` placeholder child. [The FamilySearch GEDCOM Specifi…]
- `FAMC` + `PEDI` encoding birth vs “other (step)” with a phrase. [The FamilySearch GEDCOM Specifi…]
- `ASSO` + `ROLE GODP`. [The FamilySearch GEDCOM Specifi…]
- `MARR` with wife age substructure (we modeled only one age field). [The FamilySearch GEDCOM Specifi…]
#Canonicalization step 1: normalize extension tags to URIs
Build the `Iso` between the dataset that uses surface tags (`ExtA`) and the dataset that uses URI keys (`ExtURI`):
```agda
-- ID isos are identity here (we are only normalizing extension keys).
-- `idIso` is the identity isomorphism from Cubical.Foundations.Isomorphism,
-- which is already open, so it is not redefined locally.
docA→URIIso : Iso (Dataset IRef FRef ExtA) (Dataset IRef FRef ExtURI)
docA→URIIso = DatasetIso idIso idIso schemaA
```
Now the Path between those two dataset types is:
```agda
docA→URIPath : Dataset IRef FRef ExtA ≡ Dataset IRef FRef ExtURI
docA→URIPath = isoToPath docA→URIIso
```
And now the key move:
```agda
docA→URIIso' : Iso (Dataset IRef FRef ExtA) (Dataset IRef FRef ExtURI)
docA→URIIso' = pathToIso docA→URIPath
```
This is a full `Iso → Path → Iso` loop.
In practice, this is how you complete the Structure↔Equality loop:
- Use `isoToPath` once to get a definitional bridge for transport/rewriting.
- Later, if you only have the equality (maybe from SIP or from a larger equivalence proof), recover functions with `pathToIso`.
#Transport computation: get rid of mysterious transports
If you want to transport a value across the equality, transport gives you a value in the canonical type.
But you’ll want it to compute to your conversion function. That’s where the lemma matters: [agda-libraries]
```agda
docA-normalized : Dataset IRef FRef ExtURI
docA-normalized = transport docA→URIPath docA

docA-normalized≡fun : transport docA→URIPath docA ≡ fun docA→URIIso docA
docA-normalized≡fun = transportIsoToPath docA→URIIso docA
```
This is the key “practicality hook”: you can do formal reasoning via `transport`, but still compute/execute the `fun` part of your `Iso`.
#Recipe A–F: Domain “state of affairs” → Path → `pathToIso` → usable conversions
Below, each recipe is written as a repeatable pattern you can plug into your actual free-tree GEDCOM reification pipeline.
#Recipe A — Optional pointers vs `@VOID@` pointers
#State of affairs (GEDCOM grounded)
GEDCOM distinguishes “no pointer” (optional substructure absent) from “pointer present but redacted/unknown” using `@VOID@`. [The FamilySearch GEDCOM Specifi…] [The FamilySearch GEDCOM Specifi…]
In code, you’ll often have two internal designs:
- Design 1: `Maybe P` for spouse/child refs
- Design 2: `Ptr P` with `void | ref p`
#Build the Path and extract the Iso
```agda
Maybe≃Ptr : ∀ {ℓ} {A : Type ℓ} → Iso (Maybe A) (Ptr A)
fun Maybe≃Ptr nothing  = void
fun Maybe≃Ptr (just a) = ref a
inv Maybe≃Ptr void     = nothing
inv Maybe≃Ptr (ref a)  = just a
sec Maybe≃Ptr void     = refl
sec Maybe≃Ptr (ref a)  = refl
ret Maybe≃Ptr nothing  = refl
ret Maybe≃Ptr (just a) = refl

Maybe≡Ptr : ∀ {ℓ} {A : Type ℓ} → Maybe A ≡ Ptr A
Maybe≡Ptr = isoToPath Maybe≃Ptr

Maybe≃Ptr' : ∀ {ℓ} {A : Type ℓ} → Iso (Maybe A) (Ptr A)
Maybe≃Ptr' = pathToIso Maybe≡Ptr
```
#Practical GEDCOM use
- When redacting a spouse ID, you can preserve structure by turning `just p` into `void` rather than dropping the field, matching GEDCOM’s intent for round-tripping redactions. [The FamilySearch GEDCOM Specifi…]
- For `FAM.CHIL`, `void` is a first-class placeholder for “unknown child” (which GEDCOM explicitly illustrates).
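The “preserve structure when redacting” point can be made precise with a small standalone sketch. The names `redact`, `redactList`, and `len`, and the `hide` predicate, are hypothetical helpers for illustration and are not part of the pipeline above; `Ptr` is redeclared locally so the module stands alone.

```agda
{-# OPTIONS --cubical #-}
module RedactSketch where

open import Cubical.Foundations.Prelude
open import Cubical.Data.Bool using (Bool; true; false; if_then_else_)
open import Cubical.Data.Nat using (ℕ; zero; suc)
open import Cubical.Data.List using (List; []; _∷_)

-- Local copy of the pointer type, at universe level zero for simplicity.
data Ptr (A : Type) : Type where
  void : Ptr A
  ref  : A → Ptr A

-- Replace a reference by `void` when the predicate asks to hide it.
redact : {A : Type} → (A → Bool) → Ptr A → Ptr A
redact hide void    = void
redact hide (ref a) = if hide a then void else ref a

redactList : {A : Type} → (A → Bool) → List (Ptr A) → List (Ptr A)
redactList hide []       = []
redactList hide (p ∷ ps) = redact hide p ∷ redactList hide ps

len : {A : Type} → List A → ℕ
len []       = zero
len (_ ∷ xs) = suc (len xs)

-- Redaction never drops a slot: a FAM keeps the same number of CHIL lines.
redact-preserves-len : {A : Type} (hide : A → Bool) (ps : List (Ptr A))
                     → len (redactList hide ps) ≡ len ps
redact-preserves-len hide []       = refl
redact-preserves-len hide (p ∷ ps) = cong suc (redact-preserves-len hide ps)
```

With a `Maybe`-and-filter design the analogous length lemma would fail, which is one practical reason to keep `void` as an explicit constructor.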
#Recipe B — Documented extension tags: rename surface tags by URI semantics
#State of affairs (GEDCOM grounded)
Meaning of a documented extension tag is identified by URI. Two documents can use different tag spellings but be semantically equivalent if schema maps them to the same URI. [The FamilySearch GEDCOM Specifi…]
#Build the Path at the “extension key universe” level
We already built `schemaA : Iso ExtA ExtURI` and `schemaB : Iso ExtB ExtURI`.
So we get an `Iso` between `ExtA` and `ExtB` “through meaning”:
```agda
ExtA≃ExtB : Iso ExtA ExtB
ExtA≃ExtB = compIso schemaA (invIso schemaB)
```
Now lift it to whole datasets (IDs fixed):
```agda
DocA≃DocB : Iso (Dataset IRef FRef ExtA) (Dataset IRef FRef ExtB)
DocA≃DocB = DatasetIso idIso idIso ExtA≃ExtB

DocA≡DocB : Dataset IRef FRef ExtA ≡ Dataset IRef FRef ExtB
DocA≡DocB = isoToPath DocA≃DocB

DocA≃DocB-fromPath : Iso (Dataset IRef FRef ExtA) (Dataset IRef FRef ExtB)
DocA≃DocB-fromPath = pathToIso DocA≡DocB
```
#Practical GEDCOM use
This is exactly “free tree → canonical serial” for extensions:
- Free tree: keep raw extension tags (`_SKYPEID`, `_CID`, …) plus schema.
- Canonical serial: normalize extensions by URI key (so two files with different tag spellings merge cleanly).
This is how you avoid “schema spelling noise” preventing equality.
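To see the “spelling noise disappears” claim on concrete data, here is a standalone sketch in the spirit of Example 3. The one-constructor key types, the helper `rekey`, and the two payload lists are assumptions made for illustration; they shadow nothing in the module above because this is a separate module.

```agda
{-# OPTIONS --cubical #-}
module ExtNormalizeSketch where

open import Cubical.Foundations.Prelude
open import Cubical.Data.Sigma
open import Cubical.Data.List using (List; []; _∷_)
open import Agda.Builtin.String using (String)

-- Surface tags of two documents, and the shared canonical key.
data ExtA : Type where
  SKYPEID : ExtA      -- _SKYPEID in Doc A

data ExtB : Type where
  SKP : ExtB          -- _SKP in Doc B

data ExtURI : Type where
  skypeURI : ExtURI   -- http://example.org/uri/contact/skype

keyA : ExtA → ExtURI
keyA SKYPEID = skypeURI

keyB : ExtB → ExtURI
keyB SKP = skypeURI

-- Re-key a list of extension payloads.
rekey : {K K' : Type} → (K → K') → List (K × String) → List (K' × String)
rekey f []              = []
rekey f ((k , v) ∷ kvs) = (f k , v) ∷ rekey f kvs

extsA : List (ExtA × String)
extsA = (SKYPEID , "joao.silva") ∷ []

extsB : List (ExtB × String)
extsB = (SKP , "joao.silva") ∷ []

-- After normalization the two documents carry literally the same data.
same-canonical : rekey keyA extsA ≡ rekey keyB extsB
same-canonical = refl
```

Because both sides reduce to the same closed term, the equality holds by `refl`; no path algebra is needed once the keys are canonical.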
#Recipe C — XREF reindexing: “string IDs” → finite indices (canonical)
#State of affairs (domain + GEDCOM)
GEDCOM uses cross-reference IDs (XREFs) for records; canonical serialization often wants a compact finite index space (like `Fin n`), which you mentioned in the pattern. [The FamilySearch GEDCOM Specifi…]
In practice, you pick an ordering of `INDI`/`FAM` records and produce an `Iso` between:
- surface ID type (e.g., parsed XREFs)
- canonical index type
#The key pattern
Given `Iso` on person IDs and family IDs, you lift them to an `Iso` on the entire dataset using `DatasetIso`.
For the demo, use a toy index type (you can replace with `Fin n` later):
```agda
data PIx : Type where
  p0 p1 p2 p3 : PIx

data FIx : Type where
  f0 f1 : FIx

IRef≃PIx : Iso IRef PIx
fun IRef≃PIx I1 = p0
fun IRef≃PIx I2 = p1
fun IRef≃PIx I3 = p2
fun IRef≃PIx I4 = p3
inv IRef≃PIx p0 = I1
inv IRef≃PIx p1 = I2
inv IRef≃PIx p2 = I3
inv IRef≃PIx p3 = I4
sec IRef≃PIx p0 = refl
sec IRef≃PIx p1 = refl
sec IRef≃PIx p2 = refl
sec IRef≃PIx p3 = refl
ret IRef≃PIx I1 = refl
ret IRef≃PIx I2 = refl
ret IRef≃PIx I3 = refl
ret IRef≃PIx I4 = refl

FRef≃FIx : Iso FRef FIx
fun FRef≃FIx F1 = f0
fun FRef≃FIx F2 = f1
inv FRef≃FIx f0 = F1
inv FRef≃FIx f1 = F2
sec FRef≃FIx f0 = refl
sec FRef≃FIx f1 = refl
ret FRef≃FIx F1 = refl
ret FRef≃FIx F2 = refl

Free≃Canon : Iso (Dataset IRef FRef ExtURI) (Dataset PIx FIx ExtURI)
Free≃Canon = DatasetIso IRef≃PIx FRef≃FIx idIso

Free≡Canon : Dataset IRef FRef ExtURI ≡ Dataset PIx FIx ExtURI
Free≡Canon = isoToPath Free≃Canon

Canon≃Free : Iso (Dataset PIx FIx ExtURI) (Dataset IRef FRef ExtURI)
Canon≃Free = invIso (pathToIso Free≡Canon)
```
#Practical GEDCOM use
This is your “reification” step:
- The free tree uses whatever ID space you get from parsing (`@I1@`, GUIDs, etc).
- Canonical serial uses tight indexing (`Fin n`/arrays/vectors) so serialization and algorithms are stable and total.
Because this is an `Iso`, you can:
- transport proofs about the semantics across the reindexing,
- and still run conversions efficiently.
#Recipe D — The Structure→Property→Equality→Structure loop for GEDCOM “well-formedness”
#State of affairs (GEDCOM grounded)
The spec expects that links are consistent:
- `FAM.HUSB`/`WIFE` pointers should be matched by `INDI.FAMS` backlinks.
- `FAM.CHIL` pointers should be matched by `INDI.FAMC` backlinks. [The FamilySearch GEDCOM Specifi…]

Define a property `WF` (well-formedness) on one representation, then transport it to another using the type equality obtained via `isoToPath`.
Here’s a lightweight sketch (membership proofs omitted for brevity; in real code you’d define `_∈_` and lookup predicates):
```agda
-- Placeholder: replace Unit with backlink-consistency constraints per GEDCOM.
-- WF takes the carrier as an explicit argument so that it can vary along a path of types.
WF : (X : Type) → X → Type
WF X x = Unit

-- Transport WF across a representation equality (the key move):
WF-transport : ∀ {A B : Type} (i : Iso A B) (a : A)
             → WF A a → WF B (transport (isoToPath i) a)
WF-transport i a w =
  transport (λ k → WF (isoToPath i k) (transport-filler (isoToPath i) a k)) w
```
Now connect the transported value to the concrete conversion function using `transportIsoToPath`:
```agda
WF-on-fun : ∀ {A B : Type} (i : Iso A B) (a : A) → WF A a → WF B (fun i a)
WF-on-fun {B = B} i a w =
  subst (WF B) (transportIsoToPath i a) (WF-transport i a w)
```
This is the mechanical completion of the loop:
- Structure conversion: `i : Iso A B`
- Equality: `isoToPath i : A ≡ B`
- Transport the property: `transport (λ k → WF (isoToPath i k) (transport-filler (isoToPath i) a k))`
- Return to computable structure: rewrite via `transportIsoToPath` so you end up with `WF B (fun i a)`.
The bridging lemma `transportIsoToPath` is from the Cubical library. [agda-libraries]
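As a rough idea of what the `Unit` placeholder might be replaced with, the sketch below states the two backlink rules as a type. The `Indi`, `Fam`, and `Links` records and the local `_∈_` are hypothetical stand-ins, much smaller than the `Dataset` record used elsewhere, and only the pointer-to-backlink direction is encoded.

```agda
{-# OPTIONS --cubical #-}
module BacklinkWF where

open import Cubical.Foundations.Prelude
open import Cubical.Data.List using (List; []; _∷_)
open import Cubical.Data.Sigma using (_×_)

-- List membership, defined locally to keep the sketch self-contained.
data _∈_ {A : Type} (x : A) : List A → Type where
  here  : ∀ {xs}   → x ∈ (x ∷ xs)
  there : ∀ {y xs} → x ∈ xs → x ∈ (y ∷ xs)

-- Hypothetical, stripped-down records: only the link fields are kept.
record Indi (F : Type) : Type where
  field
    fams : List F   -- FAMS: families where this person is a partner
    famc : List F   -- FAMC: families where this person is a child

record Fam (P : Type) : Type where
  field
    partners : List P   -- HUSB / WIFE pointers
    chil     : List P   -- CHIL pointers

record Links (P F : Type) : Type where
  field
    indi : P → Indi F
    fam  : F → Fam P

open Indi
open Fam
open Links

-- Every partner pointer has a FAMS backlink, and every CHIL pointer has a FAMC backlink.
WF : ∀ {P F} → Links P F → Type
WF {P} {F} d =
  (p : P) (f : F) →
    (p ∈ partners (fam d f) → f ∈ fams (indi d p))
  × (p ∈ chil (fam d f) → f ∈ famc (indi d p))
```

A proof of `WF d` is then a function that, for each person and family, converts pointer membership evidence into backlink membership evidence.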
#Recipe E — Alternative GEDCOM modeling choices become isomorphic canonical models
#State of affairs (GEDCOM grounded)
GEDCOM notes that a couple separating and reuniting can be represented either:
- as multiple events in one `FAM`, or
- as multiple `FAM` records with the same spouses, each with one marriage event. [The FamilySearch GEDCOM Specifi…]
This is a classic “many surface forms → one canonical semantic graph” situation.
#How to use `pathToIso`
Define two surface types:
- `OneFam` model: one family record with `events : List FamEvent`
- `ManyFam` model: list of family records each with one `MARR` event

Then:
- define a canonical semantic type `Episodes` (list of marriage episodes),
- show `Iso OneFam Episodes`,
- show `Iso ManyFam Episodes`,
- compose to get `Iso OneFam ManyFam`,
- turn it into equality via `isoToPath`,
- and use `pathToIso` whenever you only have the equality later (e.g., after rewriting).
Sketch:
```agda
record OneFam (P : Type) : Type where
  field
    partners : Ptr P × Ptr P
    events   : List FamEvent

record ManyFam (P : Type) : Type where
  field
    partners : Ptr P × Ptr P
    episodes : List (Maybe Date × Maybe Place)  -- a canonical “marriage episodes” view

-- You build Iso OneFam ↔ ManyFam by normalizing/denormalizing events ↔ episodes.
postulate
  OneFam≃ManyFam : ∀ {P} → Iso (OneFam P) (ManyFam P)

OneFam≡ManyFam : ∀ {P} → OneFam P ≡ ManyFam P
OneFam≡ManyFam = isoToPath OneFam≃ManyFam

OneFam≃ManyFam' : ∀ {P} → Iso (OneFam P) (ManyFam P)
OneFam≃ManyFam' = pathToIso OneFam≡ManyFam
```
#Practical use
This is how you:
- accept multiple GEDCOM authoring styles on input,
- but keep one canonical internal representation,
- while preserving a proof-relevant, reversible bridge.
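The "compose through the canonical type" step can be written once, generically. This is a small sketch; `viaCanon` and `viaCanonPath` are hypothetical names, while `compIso`, `invIso`, and `isoToPath` are the Cubical library combinators.

```agda
{-# OPTIONS --cubical #-}
module ViaCanonical where

open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism

-- Two surface forms that each normalize to the same canonical type C
-- are isomorphic to each other: go into C, then come back out.
viaCanon : {A B C : Type} → Iso A C → Iso B C → Iso A B
viaCanon toC fromB = compIso toC (invIso fromB)

-- The same bridge as a path between the two surface types.
viaCanonPath : {A B C : Type} → Iso A C → Iso B C → A ≡ B
viaCanonPath toC fromB = isoToPath (viaCanon toC fromB)
```

With `A = OneFam P`, `B = ManyFam P`, and `C = Episodes`, this would replace the postulate, provided the two normalization isomorphisms are actually constructed.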
#Recipe F — Adoption normalization (PEDI vs ADOP-event)
#State of affairs (GEDCOM grounded)
- `FAMC.PEDI` can say `ADOPTED`. [The FamilySearch GEDCOM Specifi…]
- `ADOP` event + `FAMC` + `ADOP` enum can say who adopted (`WIFE`, `HUSB`, `BOTH`). [The FamilySearch GEDCOM Specifi…]
These are different “views” that you may want to normalize into a single semantic edge/event.
#How `pathToIso` fits
You define two internal representations:
- `AdoptionA`: store adoption as `FAMC` entry with `PEDI ADOPTED` plus maybe a `PHRASE`.
- `AdoptionB`: store adoption as an `ADOP` event with `FAMC` and `AdopBy`.
Then you define a canonical semantic representation, say:
```agda
record AdoptionSem (F : Type) : Type where
  field
    fam : F
    by  : Maybe AdopBy  -- unknown vs wife/husb/both
```
Prove:
- `Iso AdoptionA AdoptionSem`
- `Iso AdoptionB AdoptionSem`
- compose to `Iso AdoptionA AdoptionB`
- and then you can use `isoToPath`/`pathToIso`.
This is particularly valuable when your “free tree” representation preserves verbatim GEDCOM structure, but your “semantic graph” wants uniform edges.
#Putting the whole pipeline together (your exact loop)
You can now assemble your intended pipeline:
- Free tree (GEDCOM-ish): `Dataset XRefINDI XRefFAM ExtTagName` (close to parsed GEDCOM record forest; includes schema tag spellings)
- Canonical serial: `Dataset (Fin nI) (Fin nF) URIKey` (IDs reindexed; extensions normalized to URIs; stable for serialization)
- Semantic graph: a structure derived from the canonical dataset:
  - spouse/partner edges
  - parent-child edges
  - adoption/step pedigree semantics
  - association edges (e.g., godparent)

Every arrow above can be made an `Iso` if you restrict to well-formed/canonical subsets and keep enough information for reification (e.g. keep schema mappings and index assignments explicitly).
Then you get the complete “univalence/transport loop”:
- Structure ↔ Structure: `i : Iso Free Canon`
- Equality: `p : Free ≡ Canon`, with `p = isoToPath i`
- Property transport: if `Q : Canon → Type` and `x : Free`, then `subst Q (transportIsoToPath i x)` turns a proof of `Q (transport p x)` into a proof of `Q (fun i x)`
- Recover conversions from equality: `pathToIso p : Iso Free Canon`

The “computational guarantee” that transport aligns with your functions is provided by `transportIsoToPath`. [agda-libraries]
#Alternative approach
#Use `Equiv`/`ua` (univalence) instead of `Iso`, and reserve `Iso` for computational extraction
If your equivalences are naturally `Equiv` (because you want univalence-friendly composition and “proof relevance”), use:
- `pathToEquiv : A ≡ B → A ≃ B` and univalence `ua : A ≃ B → A ≡ B` (Cubical foundations). [agda-libraries]
- Then convert `Equiv` to `Iso` only at the boundary where you need concrete `fun`/`inv`.
This tends to scale better when your semantic graph layer introduces truncations/quotients (e.g., quotienting by “alias” records or by “same person” merges).
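A minimal sketch of that workflow, assuming the usual module layout of the Cubical library (`ua`, `pathToEquiv`, and `uaβ` in `Cubical.Foundations.Univalence`; `equivFun` and `equivToIso` in `Cubical.Foundations.Equiv`). The names `asPath`, `asEquiv`, `computes`, and `atBoundary` are hypothetical.

```agda
{-# OPTIONS --cubical #-}
module EquivBoundary where

open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Equiv
open import Cubical.Foundations.Isomorphism
open import Cubical.Foundations.Univalence

module _ {A B : Type} (e : A ≃ B) where

  -- Univalence: an equivalence of representations is a path between them.
  asPath : A ≡ B
  asPath = ua e

  -- Going back from the path to an equivalence.
  asEquiv : A ≃ B
  asEquiv = pathToEquiv asPath

  -- Transport along the path agrees with the underlying function of e.
  computes : (x : A) → transport asPath x ≡ equivFun e x
  computes = uaβ e

  -- Convert to Iso only where explicit fun/inv are needed.
  atBoundary : Iso A B
  atBoundary = equivToIso e
```

Here `uaβ` plays the role that `transportIsoToPath` plays in the `Iso`-based recipes.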
#Use SIP (Structure Identity Principle) to turn “structure-preserving renaming” directly into paths
If you model “genealogy with structure” as `TypeWithStr` and prove “renaming preserves structure”, SIP yields a Path between structured types, which you can then feed into `pathToIso`. The Cubical SIP patterns are explicitly used to derive paths between different implementations of the same abstract structure. [agda-libraries]
In genealogy terms, SIP is a clean way to say:
“Reindexing people IDs is not just an Iso of carriers; it’s an equality of genealogical structures.”
#Related GitHub repositories and references
- Cubical Agda library (the source of `Iso`, `isoToPath`, `pathToIso`, `transport`, univalence, etc.). [GitHub]
- Agda (compiler / language), relevant for running Cubical Agda and the `--cubical` mode. [Agda Documentation]
- FamilySearch GEDCOM resources / specification hosting (and related repos under FamilySearch). [GitHub]
- agda-categories (useful if you decide to model semantic graphs categorically and use categorical `pathToIso` patterns). [GitHub]
- FamilySearch GEDCOM tooling / ecosystems (Java / GEDCOM-related codebases you may want for interoperability tests). [GitHub]

If you tell me which exact “semantic graph” shape you want (e.g., relations as sets vs adjacency lists vs indexed families), I can refactor the core `Dataset` record so the `IndiRecIso-sec`/`FamRecIso-sec` proofs become fully explicit (no postulates) while staying GEDCOM-faithful and keeping the same `pathToIso` recipes.