Yard Registry is a structured intelligence pipeline. Raw public listing data flows through ingestion, deduplication, entity resolution, enrichment, and scoring layers to produce a clean, scored, actionable business graph covering Jamaica's 14 parishes.
Yard Registry currently ingests from 17 institutional Jamaican sources:
12 are named publicly as anchor sources on the landing page, and
5 additional regulators, registries and industry bodies also contribute records.
Every source is checked against robots.txt before any fetch begins. Only six fields
are ever stored per record: name, category, phone, address/parish, website URL, and source
attribution. See the full source index for individual dossiers.
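As an illustration of the six-field limit, the following minimal sketch projects a raw listing onto a fixed record type and discards everything else. The class name, attribute names, and the to_record helper are hypothetical and are not taken from the Yard Registry codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ListingRecord:
    """The six stored fields; attribute names here are illustrative."""
    name: str
    category: Optional[str]
    phone: Optional[str]
    address_parish: Optional[str]
    website_url: Optional[str]
    source: str  # source attribution


def to_record(raw: dict, source: str) -> ListingRecord:
    """Copy only the permitted fields; any other key in `raw` is dropped."""
    return ListingRecord(
        name=raw["name"],
        category=raw.get("category"),
        phone=raw.get("phone"),
        address_parish=raw.get("address_parish"),
        website_url=raw.get("website_url"),
        source=source,
    )
```

Because the record type has no slot for anything beyond the six fields, extra keys in a raw listing (an email address, for example) cannot be stored by accident.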
visitjamaica.com licensed-property listings. JSON-LD structured data extracted.
Non-negotiable rules: robots.txt is checked before every fetch. 403 or 429 responses cause immediate termination with no retry. Only the six listed fields are ever stored. No full-page scraping, no cookies, no tracking.
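The fetch rules can be sketched with Python's standard urllib.robotparser module. This is a sketch under stated assumptions, not the actual crawler: the user-agent string, the CrawlTerminated exception, and the injected fetch callable are all hypothetical, and skipping a disallowed URL (rather than aborting the whole source) is an assumption the text does not confirm.

```python
import urllib.robotparser
from typing import Callable, Iterable, List, Tuple

USER_AGENT = "YardRegistryBot"  # hypothetical user-agent string


class CrawlTerminated(Exception):
    """Raised on a 403 or 429 response: the run stops and nothing is retried."""


def load_robots(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse the text of a robots.txt file that has already been downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawl(
    urls: Iterable[str],
    robots: urllib.robotparser.RobotFileParser,
    fetch: Callable[[str], Tuple[int, str]],
) -> List[str]:
    """Fetch each allowed URL; stop at the first 403 or 429."""
    bodies = []
    for url in urls:
        # robots.txt is consulted before every single fetch.
        if not robots.can_fetch(USER_AGENT, url):
            continue  # assumption: disallowed URLs are skipped, never requested
        status, body = fetch(url)
        if status in (403, 429):
            raise CrawlTerminated(f"HTTP {status} from {url}")
        bodies.append(body)
    return bodies
```

Passing fetch in as a callable keeps the policy separate from the HTTP client, so the termination rule can be exercised without touching the network.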
Before any record is stored, the raw business name passes through a normalisation pipeline:
lowercasing, punctuation removal, common suffix stripping (Ltd, Jamaica, Co., etc.),
and whitespace collapsing. This produces a normalized_name field used by
all downstream deduplication logic.
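A minimal sketch of these four steps, in the order listed, is shown here. The suffix set is an assumption: the text names only Ltd, Jamaica, and Co., so "limited" and "company" are illustrative extras, and stripping suffixes only from the end of the name is likewise an assumed interpretation.

```python
import string

# "ltd", "jamaica", and "co" come from the text; the others are assumed extras.
SUFFIXES = {"ltd", "limited", "jamaica", "co", "company"}

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def normalize_name(raw_name: str) -> str:
    """Lowercase, remove punctuation, strip trailing suffixes, collapse spaces."""
    text = raw_name.lower().translate(_STRIP_PUNCT)
    tokens = text.split()  # splitting on whitespace also collapses runs of spaces
    # Drop suffix tokens from the end, but never empty the name entirely.
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


print(normalize_name("  Blue Mountain  Coffee Co., Ltd. "))  # blue mountain coffee
```

Two listings that differ only in casing, punctuation, or a trailing "Ltd" then share the same normalized_name, which is what lets later deduplication compare them directly.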
Raw listing records map to canonical business entities — the deduplicated, merged representation of a real-world business. Entity resolution uses two strategies:
Pairs scoring ≥ 0.92 similarity are auto-merged. Pairs scoring from 0.70 to just under 0.92 are queued for
human review. Each canonical entity receives a stable Yard Registry identifier of the form
JBIP-{PARISH_CODE}-{SEQ:06d}
(e.g. JBIP-KGN-000042).
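The thresholds and identifier format can be sketched as follows. The function names and the "keep_separate" outcome for pairs below 0.70 are assumptions, since the text does not say what happens to low-scoring pairs. How the similarity score itself is computed is not shown here.

```python
AUTO_MERGE_THRESHOLD = 0.92
REVIEW_THRESHOLD = 0.70


def resolution_action(similarity: float) -> str:
    """Map a pair's similarity score to the action described in the text."""
    if similarity >= AUTO_MERGE_THRESHOLD:
        return "auto_merge"
    if similarity >= REVIEW_THRESHOLD:
        return "human_review"
    return "keep_separate"  # assumption: not specified in the text


def make_entity_id(parish_code: str, seq: int) -> str:
    """Format an identifier as JBIP-{PARISH_CODE}-{SEQ:06d}."""
    return f"JBIP-{parish_code.upper()}-{seq:06d}"


print(make_entity_id("KGN", 42))  # JBIP-KGN-000042
```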
After entity resolution, each entity goes through the enrichment pipeline. Three enrichers run:
social_only=true when only social URLs are found.
Enrichment is batched in a background sweep that runs every 15 minutes. Forced re-enrichment is available via the CLI for entities whose signals may have changed.
Each enriched entity receives a digital maturity tier based on its online presence profile:
The opportunity score (0–100) measures the readiness of a business for digital services. Higher scores indicate higher outreach value. Components:
Scores map to tiers:
Scores are recalculated daily at 06:00 JMT by the background scheduler.
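For illustration, the next recalculation time can be computed as below. This sketch assumes JMT means Jamaica local time, which is UTC-5 year-round with no daylight saving; the function name is hypothetical and says nothing about how the real scheduler is built.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: JMT is Jamaica local time, a fixed UTC-5 offset.
JMT = timezone(timedelta(hours=-5), "JMT")


def next_recalculation(now: datetime) -> datetime:
    """Return the next 06:00 JMT strictly after `now` (timezone-aware)."""
    local_now = now.astimezone(JMT)
    candidate = datetime.combine(local_now.date(), time(6, 0), tzinfo=JMT)
    if candidate <= local_now:
        candidate += timedelta(days=1)  # today's run has already passed
    return candidate
```

A score viewed at 07:00 JMT is therefore up to an hour old, and one viewed at 05:59 JMT is almost a full day old.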
Businesses with lat/lon coordinates (from OSM, JTB, or manual entry) are indexed into
H3 hexagonal cells at resolutions 9, 7, and 5. Hex-cell aggregates (density, average score,
website adoption rate) are recomputed weekly and exposed via the /geo API endpoints
and the dashboard Map view.
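The per-cell aggregates can be sketched as a simple group-by once each business carries a cell identifier. The Business type, the placeholder cell IDs, and the reading of "density" as a business count per cell are assumptions; in practice the cell ID would come from an H3 library call on the lat/lon pair, which is omitted here.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional


class Business(NamedTuple):
    cell: str                   # H3 cell ID at one resolution (placeholder values)
    score: float                # opportunity score, 0-100
    website_url: Optional[str]  # None when no website is known


def aggregate_by_cell(businesses: Iterable[Business]) -> Dict[str, dict]:
    """Compute density, average score, and website adoption rate per cell."""
    grouped = defaultdict(list)
    for business in businesses:
        grouped[business.cell].append(business)
    return {
        cell: {
            "density": len(members),  # assumption: count of businesses in the cell
            "average_score": sum(m.score for m in members) / len(members),
            "website_adoption_rate": (
                sum(1 for m in members if m.website_url) / len(members)
            ),
        }
        for cell, members in grouped.items()
    }
```

Running the same function once per resolution (9, 7, and 5) would give three aggregate tables, from fine-grained cells up to coarse regional ones.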
Business data is aggregated from public sources and may contain errors, outdated listings, or duplicates that escaped the deduplication pipeline. Yard Registry data is intended as a starting point for market research and lead generation, not as a definitive source of record. We recommend verifying contact details before outreach.
Businesses wishing to be removed from the registry can email [email protected]. Removal requests are processed within 14 business days.