wiki.fftac.org

Directory Of Osint And Hacktivist Organizations - Source Excerpt 04 - Taxonomy, metadata, and source model


Summary

This source excerpt begins near Taxonomy, metadata, and source model and preserves the surrounding evidence from 2IA.org/agent-file-handoff/Archive/2026-05-18-top-navigation-density-public-copy/Content/Directory of OSINT and Hacktivist Organizations.md.

**Source path:** 2IA.org/agent-file-handoff/Archive/2026-05-18-top-navigation-density-public-copy/Content/Directory of OSINT and Hacktivist Organizations.md

| High | WikiLeaks | `https://wikileaks.org/` | Controversial publishing organization central to debates around leaks, state secrecy, and transparency politics. | Leak / Transparency | Official site root; publish with strict moderation notes |

A crucial editorial rule follows from this list: **“high-priority” does not mean “endorsed.”** In the OSINT and hacktivist-adjacent space, some of the most important entities are also the most controversial. 2IA should therefore differentiate between “important to map” and “safe to normalize.” That distinction is especially important for leaderless banners, leak brands, and historically significant hacker collectives.

## Taxonomy, metadata, and source model

A definitive directory should not behave like a loose tag cloud. It should separate **institution type**, **topic/method**, and **risk/moderation status**. The strongest organizing principle for 2IA is a two-layer taxonomy: one primary category per entity, then controlled secondary tags for method, issue area, geography, and organizational form. That directly matches the broader OSINT demand signals in the uploaded background materials, which emphasize accountable workflows, verifiable methods, evidence preservation, and discoverability rather than random link accumulation.

```mermaid
flowchart TD
    A[2IA Organizations Directory]
    A --> B[OSINT and Investigations]
    A --> C[Public Records and Transparency]
    A --> D[Digital Rights and Civil Liberties]
    A --> E[Secure Tech and Measurement]
    A --> F[Hacker Associations and History]
    A --> G[Leak and Transparency Collectives]

    B --> B1[Investigative newsroom]
    B --> B2[Evidence lab]
    B --> B3[Training community]
    B --> B4[Research nonprofit]

    C --> C1[FOIA platform]
    C --> C2[Open-government guide]
    C --> C3[Archive and records group]

    D --> D1[Privacy advocacy]
    D --> D2[Free-expression advocacy]
    D --> D3[Internet-rights NGO]

    E --> E1[Censorship measurement]
    E --> E2[Anonymity tooling]
    E --> E3[Security research lab]

    F --> F1[Formal hacker association]
    F --> F2[Historic collective]
    F --> F3[Community infrastructure]

    G --> G1[Leak publisher]
    G --> G2[Transparency collective]
```

The primary controlled categories should be: `OSINT and Investigations`, `Human Rights Evidence`, `Public Records`, `Digital Rights`, `Privacy`, `Secure Communications`, `Internet Censorship`, `Security Research`, `Hacker Association`, `Hacktivist History`, `Leak / Transparency`, and `Community Infrastructure`. Secondary tags should be tightly governed and finite, such as `geolocation`, `video-verification`, `foia`, `spyware`, `surveillance`, `censorship`, `anonymity`, `public-records`, `open-data`, `cross-border-journalism`, `archive`, `forensics`, `secure-messaging`, and `protest-rights`. The directory should **not** allow uncontrolled free-form tags: they create clutter, weaken internal search, and produce thin archive pages.
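
One way to enforce the "no free-form tags" rule is to check each entry against fixed vocabularies before it is published. The sketch below is only an illustration: the function name `validate_taxonomy` is hypothetical, and the secondary-tag set is limited to the examples listed above, which the text presents as a sample ("such as") rather than a complete vocabulary.

```python
# Controlled vocabularies copied from the lists above. The secondary set is
# only the sample given in the text, so treat it as a starting point.
PRIMARY_CATEGORIES = frozenset({
    "OSINT and Investigations", "Human Rights Evidence", "Public Records",
    "Digital Rights", "Privacy", "Secure Communications",
    "Internet Censorship", "Security Research", "Hacker Association",
    "Hacktivist History", "Leak / Transparency", "Community Infrastructure",
})
SECONDARY_TAGS = frozenset({
    "geolocation", "video-verification", "foia", "spyware", "surveillance",
    "censorship", "anonymity", "public-records", "open-data",
    "cross-border-journalism", "archive", "forensics", "secure-messaging",
    "protest-rights",
})


def validate_taxonomy(primary_category, secondary_tags):
    """Return a list of problems; an empty list means the entry passes."""
    problems = []
    if primary_category not in PRIMARY_CATEGORIES:
        problems.append(f"unknown primary category: {primary_category!r}")
    for tag in secondary_tags:
        if tag not in SECONDARY_TAGS:
            problems.append(f"uncontrolled tag: {tag!r}")
    return problems
```

Returning a list of problems rather than raising on the first one lets an editor see every taxonomy issue in an entry at once.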

The metadata layer should be richer than the current site’s issue pages because a directory entry is both a webpage and a dataset row. These are the fields I recommend as the minimum publishable schema.

| Field | Required | Example |
|---|---|---|
| `display_name` | Yes | `Bellingcat` |
| `legal_name` | No | `Bellingcat Ltd.` |
| `slug` | Yes | `bellingcat` |
| `entity_id` | Yes | `2IA-ORG-0001` |
| `official_url` | Yes | `https://www.bellingcat.com/` |
| `canonical_url` | Yes | `https://2ia.org/org/bellingcat/` |
| `organization_type` | Yes | `Investigative newsroom` |
| `primary_category` | Yes | `OSINT and Investigations` |
| `secondary_tags` | Yes | `verification, geolocation, training, investigations` |
| `short_description` | Yes | `Investigative outlet focused on open-source investigations and verification.` |
| `long_description` | No | 2–4 paragraph editorial profile |
| `geographic_scope` | Yes | `Global` |
| `hq_or_base` | No | `Amsterdam, Netherlands` |
| `languages` | No | `English; multilingual resources` |
| `legal_status` | No | `Nonprofit`, `newsroom`, `collective`, `network`, `project` |
| `founded_year` | No | `2014` |
| `contact_root` | No | `Official website root or verified contact page` |
| `sameAs[]` | Yes | Verified GitHub, Mastodon, X, LinkedIn, Wikipedia if appropriate |
| `source_priority` | Yes | `official-site > github > mastodon > x > secondary references` |
| `last_verified` | Yes | `2026-05-17` |
| `verification_status` | Yes | `Verified`, `Review needed`, `Historical only` |
| `confidence_label` | Yes | `Confirmed official homepage` |
| `risk_tier` | Yes | `Green`, `Amber`, `Red` |
| `moderation_note` | Conditional | `Historical/contextual profile; no operational content` |
| `aliases[]` | No | `EFF`, `Electronic Frontier Foundation` |
| `correction_url` | Yes | `https://2ia.org/corrections-and-right-of-reply/` |
| `change_log` | Yes | `Added 2026-05-17; social links rechecked 2026-06-01` |
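
Because each entry is also a dataset row, the schema above can be checked mechanically. The following is a minimal sketch, not a definitive implementation. It assumes entries are plain dictionaries keyed by the field names in the table (with `sameAs[]` stored as `sameAs`), that `last_verified` is an ISO `YYYY-MM-DD` string as in the example, and that `moderation_note` becomes required when `risk_tier` is `Amber` or `Red`. That last condition is an assumption, since the table only marks the field "Conditional". The name `validate_entry` is hypothetical.

```python
from datetime import date

# Fields marked "Yes" in the Required column of the table above.
REQUIRED_FIELDS = (
    "display_name", "slug", "entity_id", "official_url", "canonical_url",
    "organization_type", "primary_category", "secondary_tags",
    "short_description", "geographic_scope", "sameAs", "source_priority",
    "last_verified", "verification_status", "confidence_label",
    "risk_tier", "correction_url", "change_log",
)
VERIFICATION_STATUSES = {"Verified", "Review needed", "Historical only"}
RISK_TIERS = {"Green", "Amber", "Red"}


def validate_entry(entry):
    """Return a list of schema problems; an empty list means publishable."""
    problems = []
    # A field counts as missing when absent, None, or an empty string.
    # An empty list (for example, no verified sameAs links yet) is allowed.
    for field in REQUIRED_FIELDS:
        if entry.get(field) in (None, ""):
            problems.append(f"missing required field: {field}")

    status = entry.get("verification_status")
    if status and status not in VERIFICATION_STATUSES:
        problems.append(f"invalid verification_status: {status!r}")

    tier = entry.get("risk_tier")
    if tier and tier not in RISK_TIERS:
        problems.append(f"invalid risk_tier: {tier!r}")

    # Assumption: the "Conditional" moderation_note applies to non-Green tiers.
    if tier in ("Amber", "Red") and not entry.get("moderation_note"):
        problems.append("moderation_note required when risk_tier is Amber or Red")

    verified = entry.get("last_verified")
    if verified:
        try:
            date.fromisoformat(verified)
        except (TypeError, ValueError):
            problems.append(f"last_verified is not an ISO date: {verified!r}")
    return problems


# Example row built from the sample values in the table.
entry = {
    "display_name": "Bellingcat",
    "slug": "bellingcat",
    "entity_id": "2IA-ORG-0001",
    "official_url": "https://www.bellingcat.com/",
    "canonical_url": "https://2ia.org/org/bellingcat/",
    "organization_type": "Investigative newsroom",
    "primary_category": "OSINT and Investigations",
    "secondary_tags": ["verification", "geolocation", "training", "investigations"],
    "short_description": "Investigative outlet focused on open-source "
                         "investigations and verification.",
    "geographic_scope": "Global",
    "sameAs": [],
    "source_priority": "official-site > github > mastodon > x > secondary references",
    "last_verified": "2026-05-17",
    "verification_status": "Verified",
    "confidence_label": "Confirmed official homepage",
    "risk_tier": "Green",
    "correction_url": "https://2ia.org/corrections-and-right-of-reply/",
    "change_log": "Added 2026-05-17",
}
print(validate_entry(entry))
```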

A strong canonical rule is essential. Every organization should have **one** internal canonical URL, with alias redirects. For acronym-first entities, use the shortest unambiguous slug, such as `/org/eff/`, `/org/epic/`, `/org/icij/`, `/org/occrp/`, and `/org/ooni/`, then 301-redirect long-name variants. For entities better known by full names, use the full hyphenated slug, such as `/org/forensic-architecture/` or `/org/freedom-of-the-press-foundation/`. This is cleaner for users and much better for search than multiple competing internal URLs.
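
The slug rule can be expressed as a small helper that picks one canonical path and returns the variants to 301-redirect. This is a sketch under stated assumptions: `slugify` and `canonical_and_redirects` are hypothetical names, and whether an entity counts as "acronym-first" is treated as an editorial decision that the caller signals by passing `acronym`.

```python
import re


def slugify(name):
    """Lowercase the name and join its ASCII alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def canonical_and_redirects(full_name, acronym=None):
    """Return (canonical_path, redirects) for one organization.

    `redirects` maps each non-canonical path to the canonical one, ready to
    be served as 301 responses by whatever web server the site uses.
    """
    long_slug = slugify(full_name)
    # Acronym-first entities use the short slug; everyone else the full name.
    canonical = slugify(acronym) if acronym else long_slug
    redirects = {}
    if canonical != long_slug:
        redirects[f"/org/{long_slug}/"] = f"/org/{canonical}/"
    return f"/org/{canonical}/", redirects


print(canonical_and_redirects("Electronic Frontier Foundation", acronym="EFF"))
print(canonical_and_redirects("Forensic Architecture"))
```

The helper does not decide whether a short slug is unambiguous; that check still needs an editor or a lookup against existing slugs.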

The source-priority model should be explicit and public. OSINT literature and the user-needs materials both stress that no single source is sufficient, and that verifiable workflow matters more than volume. For a directory, that means official identity first, code second, low-volatility social third, volatile social fourth, and methods literature as context rather than identity.

| Source type | Priority | Representative exact URL | Suggested anchor text | Best use | Main risk | Directory rule |
|---|---|---|---|---|---|---|
| Official site | Highest | `https://www.bellingcat.com/` | `Bellingcat` | Canonical name, mission, official statements, contact entry point | Marketing spin, stale pages | Use as the primary outbound source whenever available |
| GitHub | High | `https://github.com/ooni` | `OONI on GitHub` | Tool ownership, code stewardship, release activity | Mirrors, abandoned repos, unofficial forks | Use only when clearly tied to the official organization |
| Mastodon | Medium-high | `Unspecified until verified from official site` | `{Organization} on Mastodon` | Public updates where the account is clearly verified or linked from the official site | Instance sprawl and impersonation | Use in `sameAs[]`, not as primary identity evidence |
| X / Twitter | Medium | `https://x.com/EFF` | `EFF on X` | Recency, public statements, distribution | Impersonation, deleted content, volatility | Never use as the primary source for the profile description |
| Academic papers / protocols | Medium | `https://www.ohchr.org/sites/default/files/2022-04/OHCHR_BerkeleyProtocol.pdf` | `Berkeley Protocol` | Methods, evidence handling, historical context | Often not current for identity/contact | Use for context and methodology, not canonical identity |
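
The priority order in the table can be applied when sorting an entry's outbound links. The sketch below is illustrative only: `classify` and `rank_sources` are hypothetical names, the order follows the `source_priority` example (`official-site > github > mastodon > x > secondary references`), and Mastodon hosts must be supplied by the caller because an instance cannot be recognized from its hostname alone. Classification by hostname does not verify that an account is official; that remains an editorial check.

```python
from urllib.parse import urlparse

# Highest priority first, following the source_priority example.
PRIORITY = ["official-site", "github", "mastodon", "x", "secondary"]


def classify(url, official_host, mastodon_hosts=frozenset()):
    """Assign a URL to one source type based on its hostname."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == official_host:
        return "official-site"
    if host == "github.com":
        return "github"
    if host in mastodon_hosts:
        return "mastodon"
    if host in ("x.com", "twitter.com"):
        return "x"
    return "secondary"


def rank_sources(urls, official_host, mastodon_hosts=frozenset()):
    """Sort URLs from highest to lowest priority (stable within a tier)."""
    def rank(url):
        return PRIORITY.index(classify(url, official_host, mastodon_hosts))
    return sorted(urls, key=rank)


# URLs taken from the table above, mixed here only to exercise every tier.
urls = [
    "https://x.com/EFF",
    "https://www.ohchr.org/sites/default/files/2022-04/OHCHR_BerkeleyProtocol.pdf",
    "https://github.com/ooni",
    "https://www.bellingcat.com/",
]
for url in rank_sources(urls, official_host="bellingcat.com"):
    print(url)
```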

## Seed directory and structured data

The fastest way for 2IA to look definitive is to publish a **seed set of fifty profiles** that cover the main strata of the ecosystem: OSINT newsrooms and evidence labs, public-records infrastructure, digital-rights NGOs, secure-tech and measurement projects, formal hacker associations, community infrastructure, and transparency or leak collectives. The table below pairs each exact suggested link text with the preferred official URL and the recommended internal canonical URL on 2IA. This is the launch spine I would build first.