The database uses the following software to function:

  • SPARQL for queries
  • Oxigraph for webpage serving
  • OpenAlex snapshots for data

Database

Structure

The database is synced from the OpenAlex AWS snapshot. The data is downloaded as JSON Lines (.jsonl) files and then converted into the graph database.
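The conversion step can be sketched roughly as follows. This is a minimal illustration, not the project's actual converter: the predicate namespace `https://example.org/schema/` and the helper names `escape_literal`, `author_to_ntriples`, `convert_jsonl` and `convert_file` are hypothetical, and the sketch assumes the snapshot parts are gzip-compressed files with one JSON record per line.

```python
import gzip
import json
from typing import Iterable, Iterator

# Hypothetical predicate namespace, chosen only for this example.
SCHEMA = "https://example.org/schema/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def author_to_ntriples(record: dict) -> Iterator[str]:
    """Yield N-Triples lines for two fields of one author record."""
    subject = f"<{record['id']}>"
    name = record.get("display_name")
    if name is not None:
        yield f'{subject} <{SCHEMA}display_name> "{escape_literal(name)}" .'
    works = record.get("works_count")
    if works is not None:
        yield f'{subject} <{SCHEMA}works_count> "{works}"^^<{XSD_INTEGER}> .'


def convert_jsonl(lines: Iterable[str]) -> Iterator[str]:
    """Parse JSON Lines input (one record per line) and emit triples."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield from author_to_ntriples(json.loads(line))


def convert_file(path: str) -> Iterator[str]:
    """Stream a gzip-compressed snapshot part without loading it whole."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from convert_jsonl(handle)
```

Streaming line by line keeps memory use flat regardless of file size, which matters because snapshot parts can be large. The resulting N-Triples lines could then be bulk-loaded into the graph store.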

Authors

Each author record in the snapshot JSON has the following structure:

  • id: OpenAlex author accession link

  • orcid: todo

  • display_name: The name of the author

  • display_name_alternatives: list of alternative names

  • works_count: the number of works

  • cited_by_count: number of citations the author has received from other works

  • most_cited_work: the name of the most cited work

  • summary_stats: summary statistics

    • 2yr_mean_citedness: average citedness over two years
    • h_index: todo
    • i10_index: todo
    • oa_percent: todo
    • works_count: number of works
    • cited_by_count: number of citations the author has received from other works
    • 2yr_works_count: todo
    • 2yr_cited_by_count: todo
    • 2yr_i10_index: todo
    • 2yr_h_index: todo
  • ids: TODO

  • last_known_institution: the last known institution

    • id: OpenAlex institution accession link
    • ror: todo
    • display_name: Name of the institution
    • country_code: country code of the institution
    • type: Type of the institution
  • counts_by_year: list of work counts per year

    • year: year of publication
    • works_count: number of works published
    • oa_works_count: todo
    • cited_by_count: todo
  • x_concepts: todo

  • works_api_url: todo

  • updated_date: todo

  • created_date: todo

  • updated: todo
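As an illustration of how the nested fields above can be read, here is a small sketch. The sample record is invented for the example (only the field names come from the list above), and `summarize_author` is a hypothetical helper, not part of the project.

```python
import json

# Invented sample line; only the field names follow the structure above.
SAMPLE_LINE = json.dumps(
    {
        "id": "https://openalex.org/A0000000000",
        "display_name": "Jane Doe",
        "summary_stats": {"h_index": 4},
        "last_known_institution": None,
        "counts_by_year": [
            {"year": 2022, "works_count": 3, "cited_by_count": 10},
            {"year": 2021, "works_count": 1, "cited_by_count": 2},
        ],
    }
)


def summarize_author(record: dict) -> dict:
    """Collect a few nested fields, tolerating missing or null values."""
    institution = record.get("last_known_institution") or {}
    stats = record.get("summary_stats") or {}
    yearly = record.get("counts_by_year") or []
    return {
        "name": record.get("display_name"),
        "institution": institution.get("display_name"),
        "h_index": stats.get("h_index"),
        # Map each year to the number of works published in it.
        "works_by_year": {row["year"]: row["works_count"] for row in yearly},
    }


summary = summarize_author(json.loads(SAMPLE_LINE))
```

The `or {}` and `or []` fallbacks are there because nested objects such as `last_known_institution` may be null in a record rather than absent.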

Autogenerated

  • id: string
  • orcid: nothing
  • display_name: string
  • display_name_alternatives: list
  • works_count: int
  • cited_by_count: int
  • most_cited_work: string
  • summary_stats:
    • 2yr_mean_citedness: int
    • h_index: int
    • i10_index: int
    • oa_percent: int
    • works_count: int
    • cited_by_count: int
    • 2yr_works_count: int
    • 2yr_cited_by_count: int
    • 2yr_i10_index: int
    • 2yr_h_index: int
  • ids:
    • openalex: string
  • last_known_institution:
    • id: string
    • ror: string
    • display_name: string
    • country_code: string
    • type: string
  • counts_by_year:
    • year: int
    • works_count: int
    • oa_works_count: int
    • cited_by_count: int
  • x_concepts:
    • id: string
    • wikidata: string
    • display_name: string
    • level: int
    • score: float
  • works_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string
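A type listing like the one above could be produced from a single sample record along the following lines. This is a guess at the approach, not the project's actual generator: `type_name` and `describe` are hypothetical names, and the sketch assumes that a list of objects is described by its first element.

```python
from typing import Any, List


def type_name(value: Any) -> str:
    """Map a parsed JSON value to the labels used in the listings."""
    if value is None:
        return "nothing"
    # bool must be tested before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "list"
    return type(value).__name__


def describe(record: dict, indent: int = 2) -> List[str]:
    """Return one bullet line per field, recursing into nested objects."""
    lines = []
    pad = " " * indent
    for key, value in record.items():
        # Assumption: a non-empty list of objects is shown via its first item.
        if isinstance(value, list) and value and isinstance(value[0], dict):
            value = value[0]
        if isinstance(value, dict):
            lines.append(f"{pad}• {key}:")
            lines.extend(describe(value, indent + 2))
        else:
            lines.append(f"{pad}• {key}: {type_name(value)}")
    return lines
```

If the listings were generated this way, each type reflects only the sample record. That may be why a field appears as `nothing` when it was null in the sample, or why `2yr_mean_citedness` is listed as `int` for some entities and `float` for others.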

Concepts

Autogenerated

  • id: string
  • wikidata: string
  • display_name: string
  • level: int
  • description: string
  • works_count: int
  • cited_by_count: int
  • summary_stats:
    • 2yr_mean_citedness: int
    • h_index: int
    • i10_index: int
    • oa_percent: int
    • works_count: int
    • cited_by_count: int
    • 2yr_works_count: int
    • 2yr_cited_by_count: int
    • 2yr_i10_index: int
    • 2yr_h_index: int
  • ids:
    • openalex: string
    • wikidata: string
    • wikipedia: string
    • mag: int
  • image_url: nothing
  • image_thumbnail_url: nothing
  • international:
    • display_name:
      • ar: string
      • be: string
      • bn: string
      • ca: string
      • cs: string
      • de: string
      • en: string
      • eo: string
      • es: string
      • et: string
      • fa: string
      • fi: string
      • fr: string
      • he: string
      • hu: string
      • hy: string
      • it: string
      • ja: string
      • kk: string
      • kk-arab: string
      • kk-cyrl: string
      • kk-latn: string
      • ky: string
      • nb: string
      • nl: string
      • oc: string
      • pl: string
      • ru: string
      • sl: string
      • sr: string
      • sv: string
      • ta: string
      • uk: string
      • uz: string
      • vi: string
      • zh: string
      • zh-hans: string
      • zh-hant: string
      • zh-hk: string
    • description:
      • bn: string
      • ca: string
      • de: string
      • en: string
      • fr: string
      • ru: string
      • sr: string
  • ancestors:
    • id: string
    • wikidata: string
    • display_name: string
    • level: int
  • related_concepts:
    • id: string
    • wikidata: nothing
    • display_name: string
    • level: int
    • score: float
  • counts_by_year: list
  • works_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string
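The `international` object holds translations keyed by language code, and not every language is present for every record. A lookup with fallbacks might look like the sketch below; `localized_name` is a hypothetical helper, and the fallback order (exact code, base language, default language, then the top-level `display_name`) is one possible choice.

```python
from typing import Optional


def localized_name(record: dict, lang: str, default_lang: str = "en") -> Optional[str]:
    """Pick the best available display name for the requested language."""
    international = record.get("international") or {}
    names = international.get("display_name") or {}
    # Try the exact code, then its base language (e.g. "zh-hans" -> "zh"),
    # then the default language.
    for code in (lang, lang.split("-")[0], default_lang):
        if code in names:
            return names[code]
    # Fall back to the untranslated top-level name.
    return record.get("display_name")
```

For example, a request for `zh-hans` on a record that only carries `zh` and `en` would return the `zh` entry.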

Domains

Autogenerated

  • id: string
  • display_name: string
  • description: string
  • display_name_alternatives: list
  • ids:
    • wikidata: string
    • wikipedia: string
  • fields:
    • id: string
    • display_name: string
  • siblings:
    • id: string
    • display_name: string
  • works_count: int
  • cited_by_count: int
  • works_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string

Fields

Autogenerated

  • id: string
  • display_name: string
  • description: string
  • display_name_alternatives: list
  • ids:
    • wikidata: string
    • wikipedia: string
  • domain:
    • id: string
    • display_name: string
  • subfields:
    • id: string
    • display_name: string
  • siblings:
    • id: string
    • display_name: string
  • works_count: int
  • cited_by_count: int
  • works_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string

Funders

Autogenerated

  • id: string
  • display_name: string
  • ids:
    • openalex: string
    • wikidata: string
    • ror: string
    • crossref: int
    • doi: string
  • alternate_titles: list
  • country_code: string
  • description: string
  • homepage_url: string
  • image_url: string
  • image_thumbnail_url: string
  • roles:
    • role: string
    • id: string
    • works_count: int
  • grants_count: int
  • works_count: int
  • cited_by_count: int
  • summary_stats:
    • 2yr_mean_citedness: float
    • h_index: int
    • i10_index: int
    • oa_percent: float
    • works_count: int
    • cited_by_count: int
    • 2yr_works_count: int
    • 2yr_cited_by_count: int
    • 2yr_i10_index: int
    • 2yr_h_index: int
  • counts_by_year:
    • year: int
    • works_count: int
    • oa_works_count: int
    • cited_by_count: int
  • x_concepts:
    • id: string
    • wikidata: string
    • display_name: string
    • level: int
    • score: float
  • updated_date: string
  • created_date: string
  • updated: string

Institutions

Autogenerated

  • id: string
  • ror: string
  • display_name: string
  • country_code: string
  • type: string
  • homepage_url: string
  • image_url: string
  • image_thumbnail_url: string
  • display_name_acronyms: list
  • display_name_alternatives: list
  • works_count: int
  • cited_by_count: int
  • summary_stats:
    • 2yr_mean_citedness: float
    • h_index: int
    • i10_index: int
    • oa_percent: float
    • works_count: int
    • cited_by_count: int
    • 2yr_works_count: int
    • 2yr_cited_by_count: int
    • 2yr_i10_index: int
    • 2yr_h_index: int
  • ids:
    • openalex: string
    • ror: string
    • grid: string
    • wikipedia: string
    • wikidata: string
    • mag: int
  • roles:
    • role: string
    • id: string
    • works_count: int
  • repositories: list
  • geo:
    • city: nothing
    • geonames_city_id: nothing
    • region: nothing
    • country_code: string
    • country: nothing
    • latitude: float
    • longitude: float
  • international:
    • display_name:
      • ar: string
      • azb: string
      • be: string
      • bg: string
      • br: string
      • ca: string
      • cs: string
      • cy: string
      • de: string
      • el: string
      • en: string
      • en-gb: string
      • eo: string
      • es: string
      • et: string
      • eu: string
      • fa: string
      • fr: string
      • gl: string
      • he: string
      • hu: string
      • hy: string
      • id: string
      • it: string
      • ja: string
      • ko: string
      • lb: string
      • nb: string
      • nl: string
      • nn: string
      • pl: string
      • pt: string
      • ro: string
      • ru: string
      • sk: string
      • sv: string
      • tl: string
      • uk: string
      • vi: string
      • zh: string
      • zh-hant: string
      • zh-hk: string
  • associated_institutions:
    • id: string
    • ror: string
    • display_name: string
    • country_code: string
    • type: string
    • relationship: string
  • counts_by_year:
    • year: int
    • works_count: int
    • oa_works_count: int
    • cited_by_count: int
  • x_concepts:
    • id: string
    • wikidata: string
    • display_name: string
    • level: int
    • score: float
  • works_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string

Merged_ids

Publishers

Autogenerated

  • id: string
  • display_name: string
  • ids:
    • openalex: string
  • alternate_titles: list
  • parent_publisher: nothing
  • lineage: list
  • hierarchy_level: int
  • country_codes: list
  • homepage_url: nothing
  • image_url: nothing
  • image_thumbnail_url: nothing
  • roles:
    • role: string
    • id: string
    • works_count: int
  • works_count: int
  • cited_by_count: int
  • sources_count: int
  • summary_stats:
    • 2yr_mean_citedness: int
    • h_index: int
    • i10_index: int
    • oa_percent: int
    • works_count: int
    • cited_by_count: int
    • sources_count: int
    • 2yr_works_count: int
    • 2yr_cited_by_count: int
    • 2yr_i10_index: int
    • 2yr_h_index: int
  • counts_by_year:
    • year: int
    • works_count: int
    • oa_works_count: int
    • cited_by_count: int
  • x_concepts:
    • id: string
    • wikidata: string
    • display_name: string
    • level: int
    • score: float
  • sources_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string

Sources

Autogenerated

  • id: string
  • issn_l: string
  • issn: list
  • display_name: string
  • publisher: string
  • host_organization: string
  • host_organization_name: string
  • host_organization_lineage: list
  • host_organization_lineage_names: list
  • is_oa: bool
  • is_in_doaj: bool
  • host_institution_lineage: list
  • host_institution_lineage_names: list
  • publisher_lineage: list
  • publisher_lineage_names: list
  • publisher_id: string
  • type: string
  • works_count: int
  • cited_by_count: int
  • summary_stats:
    • 2yr_mean_citedness: float
    • h_index: int
    • i10_index: int
    • oa_percent: float
    • works_count: int
    • cited_by_count: int
    • 2yr_works_count: int
    • 2yr_cited_by_count: int
    • 2yr_i10_index: int
    • 2yr_h_index: int
  • alternate_titles: list
  • abbreviated_title: string
  • homepage_url: string
  • country_code: string
  • ids:
    • openalex: string
    • issn_l: string
    • issn: list
    • fatcat: string
    • wikidata: string
  • apc_prices:
    • price: int
    • currency: string
  • apc_usd: int
  • societies: list
  • counts_by_year:
    • year: int
    • works_count: int
    • oa_works_count: int
    • cited_by_count: int
  • x_concepts:
    • id: string
    • wikidata: string
    • display_name: string
    • level: int
    • score: float
  • works_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string

Subfields

Autogenerated

  • id: string
  • display_name: string
  • description: string
  • display_name_alternatives: list
  • ids:
    • wikidata: string
    • wikipedia: string
  • field:
    • id: string
    • display_name: string
  • domain:
    • id: string
    • display_name: string
  • topics:
    • id: string
    • display_name: string
  • siblings:
    • id: string
    • display_name: string
  • works_count: int
  • cited_by_count: int
  • works_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string

Topics

Autogenerated

  • id: string
  • display_name: string
  • subfield:
    • id: string
    • display_name: string
  • field:
    • id: string
    • display_name: string
  • domain:
    • id: string
    • display_name: string
  • description: string
  • keywords: list
  • ids:
    • openalex: string
    • wikipedia: string
  • siblings:
    • id: string
    • display_name: string
  • works_count: int
  • cited_by_count: int
  • works_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string
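Each topic record carries its `subfield`, `field` and `domain`, so the full classification path can be read from a single record. A minimal sketch, with a hypothetical helper name `topic_path` and an invented sample record:

```python
def topic_path(topic: dict) -> str:
    """Join domain, field, subfield and topic names into one path string."""
    levels = [topic.get("domain"), topic.get("field"), topic.get("subfield")]
    # Skip any level that is missing or null in the record.
    names = [level["display_name"] for level in levels if level]
    names.append(topic["display_name"])
    return " > ".join(names)


# Invented sample; only the field names follow the listing above.
sample_topic = {
    "display_name": "Example Topic",
    "subfield": {"id": "s1", "display_name": "Example Subfield"},
    "field": {"id": "f1", "display_name": "Example Field"},
    "domain": {"id": "d1", "display_name": "Example Domain"},
}
path = topic_path(sample_topic)
```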

Works

Autogenerated

  • id: string
  • doi: string
  • doi_registration_agency: string
  • display_name: nothing
  • title: nothing
  • publication_year: int
  • publication_date: string
  • language: nothing
  • ids:
    • openalex: string
    • doi: string
  • primary_location:
    • source: nothing
    • pdf_url: nothing
    • landing_page_url: string
    • is_oa: bool
    • version: nothing
    • license: nothing
    • doi: string
  • best_oa_location: nothing
  • type: string
  • open_access:
    • is_oa: bool
    • oa_status: string
    • oa_url: nothing
    • any_repository_has_fulltext: bool
  • authorships: list
  • corresponding_author_ids: list
  • corresponding_institution_ids: list
  • cited_by_count: int
  • summary_stats:
    • cited_by_count: int
    • 2yr_cited_by_count: int
  • biblio:
    • volume: nothing
    • issue: nothing
    • first_page: nothing
    • last_page: nothing
  • is_retracted: bool
  • is_paratext: bool
  • concepts:
    • id: string
    • wikidata: string
    • display_name: string
    • level: int
    • score: float
  • mesh: list
  • locations_count: int
  • locations:
    • source: nothing
    • pdf_url: nothing
    • landing_page_url: string
    • is_oa: bool
    • version: nothing
    • license: nothing
    • doi: string
  • referenced_works: list
  • referenced_works_count: int
  • sustainable_development_goals: list
  • grants: list
  • apc_list: nothing
  • apc_paid: nothing
  • related_works: list
  • abstract_inverted_index: nothing
  • counts_by_year: list
  • cited_by_api_url: string
  • updated_date: string
  • created_date: string
  • updated: string
  • authors_count: int
  • concepts_count: int
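The `abstract_inverted_index` field is listed as `nothing` above because the sampled work had no abstract. When it is present, OpenAlex stores the abstract as an inverted index that maps each word to the positions where it occurs. The plain text can be rebuilt as in the following sketch; `rebuild_abstract` is a hypothetical helper name.

```python
from typing import Dict, List, Optional


def rebuild_abstract(inverted_index: Optional[Dict[str, List[int]]]) -> Optional[str]:
    """Turn a word -> positions mapping back into running text."""
    if not inverted_index:
        return None
    positions = []
    for word, indexes in inverted_index.items():
        for index in indexes:
            positions.append((index, word))
    # Sorting by position restores the original word order.
    positions.sort()
    return " ".join(word for _, word in positions)
```

A word that occurs several times appears once as a key with several positions, so the rebuilt text repeats it at each of them.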

Usage

You can query the database using the SPARQL query language here
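Queries can also be sent programmatically over the standard SPARQL protocol. The sketch below makes several assumptions that need adjusting to the actual deployment: the endpoint URL `http://localhost:7878/query`, the predicate IRI `https://example.org/schema/display_name`, and the helper names `build_request` and `run_query` are all illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; replace with the address of the real server.
ENDPOINT = "http://localhost:7878/query"

# The predicate IRI is hypothetical; use the one produced by the converter.
QUERY = """
SELECT ?author ?name WHERE {
  ?author <https://example.org/schema/display_name> ?name .
}
LIMIT 10
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run_query(endpoint: str, query: str) -> list:
    """Send the query and return the list of result bindings."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling `run_query(ENDPOINT, QUERY)` against a running server would return one dictionary per result row, keyed by variable name (`author`, `name`).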

Introduction