Challenge 2 – Literature

Challenge 2 – Extract geographical information

Which city, region, and country does the prevalence information refer to?

PubMed example: 
Kalua K, Chirwa T, Kalilani L, Abbenyi S, Mukaka M, Bailey R. Prevalence and risk factors for trachoma in central and southern Malawi. PloS one. 2010, 5(2). 
BACKGROUND: Trachoma, one of the neglected tropical diseases is suspected to be endemic in Malawi. OBJECTIVES: To determine the prevalence of trachoma and associated risk factors in central and southern Malawi. METHODOLOGY/PRINCIPAL FINDINGS: A population based survey conducted in randomly selected clusters in Chikwawa district (population 438,895), southern Malawi and Mchinji district (population 456,558), central Malawi. Children aged 1-9 years and adults aged 15 and above were assessed for clinical signs of trachoma. In total, 1010 households in Chikwawa and 1016 households in Mchinji districts were enumerated within 108 clusters (54 clusters in each district). A total of 6,792 persons were examined for ocular signs of trachoma. The prevalence of trachomatous inflammation, follicular (TF) among children aged 1-9 years was 13.6% (CI 11.6-15.6) in Chikwawa and 21.7% (CI 19.5-23.9) in Mchinji districts respectively. The prevalence of trachoma trichiasis (TT) in women and men aged 15 years and above was 0.6% (CI 0.2-0.9) in Chikwawa and 0.3% (CI 0.04-0.6) in Mchinji respectively. The presence of a dirty face was significantly associated with trachoma follicular (TF) in both Chikwawa and Mchinji districts (P<0.001). CONCLUSION/SIGNIFICANCE: Prevalence rates of trachoma follicles (TF) in Central and Southern Malawi exceeds the WHO guidelines for the intervention with mass antibiotic distribution (TF>10%), and warrants the trachoma SAFE control strategy to be undertaken in Chikwawa and Mchinji districts.

Country: Malawi
Region: Central Malawi, Southern Malawi
City/Town: Chikwawa, Mchinji


Algorithmic Approach:

  • Text search of the title and abstract for each item (country, state/province, city/town)
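
A minimal sketch of such a text search in plain Java is given here for illustration. It is not the project's Lucene-based code: the class name CountryTagger, its tag method, and the name-to-ISO map passed to it are all hypothetical. It looks for each country name as a whole word in the combined title and abstract and returns the matching ISO codes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical helper, not the project's actual class.
class CountryTagger {
    // Country name -> ISO code (assumed input format).
    private final Map<String, String> nameToIso;
    // Country name -> precompiled whole-word pattern.
    private final Map<String, Pattern> nameToPattern = new LinkedHashMap<>();

    CountryTagger(Map<String, String> nameToIso) {
        this.nameToIso = new LinkedHashMap<>(nameToIso);
        for (String name : this.nameToIso.keySet()) {
            // \b stops "Mali" from matching inside "Somalia";
            // Pattern.quote escapes any regex characters in the name.
            nameToPattern.put(name, Pattern.compile(
                    "\\b" + Pattern.quote(name) + "\\b",
                    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE));
        }
    }

    /** Returns the ISO codes of all countries named in the title or abstract. */
    Set<String> tag(String title, String abstractText) {
        String text = title + " " + abstractText;
        Set<String> codes = new TreeSet<>();
        for (Map.Entry<String, Pattern> e : nameToPattern.entrySet()) {
            if (e.getValue().matcher(text).find()) {
                codes.add(nameToIso.get(e.getKey()));
            }
        }
        return codes;
    }
}
```

Case-insensitive matching is a design choice in this sketch: it catches names written in capitals, but it can also tag ordinary words that happen to be country names, so a case-sensitive variant may be preferable.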

Project Status:

  • A Java program stores the PubMed references in a Lucene index, runs a text search for country names, and tags each matching reference with the associated country.

Challenges:

  • Ambiguous town/city names shared by several places (e.g. Paris, France vs. Paris, Texas)
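
One possible way to handle this ambiguity, sketched here under assumptions, is to narrow the candidates for a town name to the countries already tagged in the same reference. The PlaceResolver and Place classes are hypothetical and are not part of the existing program. When no candidate matches a tagged country, the sketch returns all candidates so that the ambiguity stays visible instead of being guessed away.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only.
class PlaceResolver {
    /** One gazetteer entry: a place name with its region and country code. */
    static final class Place {
        final String name;
        final String region;
        final String countryIso;

        Place(String name, String region, String countryIso) {
            this.name = name;
            this.region = region;
            this.countryIso = countryIso;
        }
    }

    // Lower-cased place name -> every place that carries that name.
    private final Map<String, List<Place>> byName = new HashMap<>();

    void add(Place p) {
        byName.computeIfAbsent(p.name.toLowerCase(Locale.ROOT),
                k -> new ArrayList<>()).add(p);
    }

    /**
     * Returns the candidates for a name, narrowed to the tagged countries.
     * If none of the candidates lies in a tagged country, all are returned.
     */
    List<Place> resolve(String name, Set<String> taggedCountries) {
        List<Place> candidates = byName.getOrDefault(
                name.toLowerCase(Locale.ROOT), new ArrayList<>());
        List<Place> narrowed = new ArrayList<>();
        for (Place p : candidates) {
            if (taggedCountries.contains(p.countryIso)) {
                narrowed.add(p);
            }
        }
        return narrowed.isEmpty() ? new ArrayList<>(candidates) : narrowed;
    }
}
```

With entries for Paris in both FR and US, a reference tagged only with US would resolve "Paris" to the Texas entry, while a reference with no country tag would keep both candidates.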

Datasets:

  • Country List with country names (e.g. Canada) and ISO codes (e.g. CA)
  • State/Province Lists [to be assembled]
  • City/Town Lists [to be assembled]
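
As a stand-in for the country list, the sketch below builds a name-to-code map from the locale data bundled with the JDK. This is an assumption made for illustration: the project's own country list and its file format are not described here, and CountryList is a hypothetical class name. The state/province and city/town lists have no JDK equivalent and would still need to be assembled separately.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the project's real country list may differ.
class CountryList {
    /** Builds a map of English country name -> ISO 3166-1 alpha-2 code. */
    static Map<String, String> load() {
        Map<String, String> nameToIso = new TreeMap<>();
        for (String iso : Locale.getISOCountries()) {
            // A locale with only a region set yields that country's display name.
            Locale country = new Locale.Builder().setRegion(iso).build();
            nameToIso.put(country.getDisplayCountry(Locale.ENGLISH), iso);
        }
        return nameToIso;
    }
}
```

The display names come from the JDK's locale data, so alternative spellings and historical names would have to be added by hand.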