From: Nathan TeBlunthuis
Date: Tue, 31 Mar 2020 22:25:51 +0000 (-0700)
Subject: Improve README.md for keywords
X-Git-Url: https://code.communitydata.science/covid19.git/commitdiff_plain/4fd516a700501e0b934ddc7d90c27dfa5b87e7f4?ds=sidebyside;hp=98b07b8098611287eaa775b09622d1f3514303c8

Improve README.md for keywords
---

diff --git a/keywords/README.md b/keywords/README.md
index 18df219..5bf27ba 100644
--- a/keywords/README.md
+++ b/keywords/README.md
@@ -1,3 +1,7 @@
-# Transliterations
+# Keywords
 
-This part of the project collects tranliterations of key phrases related to COVID-19 using Wikidata. We search the Wikidata API for entities in `src/wikidata_search.py` and then we make simple SPARQL queries in `src/wikidata_transliterations.py` to collect labels and aliases the entities. The labels come with language metadata. This seems to provide a decent initial list of relevant terms across multiple languages.
+This code finds trending web searches related to the COVID-19 pandemic using Google trends (`collect_trends.py`). It then searches for relevant keywords on Wikidata (`wikidata_search`) in order to find high-quality translations of important words and phrases (`wikidata_translations.py`). The goal is to support efforts expanding the Observatory to information in many languages beyond English.
+
+We search the Wikidata API for entities in `src/wikidata_search.py` and then we make simple SPARQL queries in `src/wikidata_translations.py` to collect labels and aliases the entities. The labels come with language metadata. This seems to provide a decent initial list of relevant terms across multiple languages.
+
+The output data lives at [covid19.communitydata.science](https://covid19.communitydata.science/datasets/keywords).
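
For readers unfamiliar with the two-step flow the README describes (entity search, then a SPARQL query for labels and aliases), here is a minimal sketch of how it could look. It is not the code in `src/wikidata_search.py` or `src/wikidata_translations.py`; every function name below is hypothetical, and the choice of `rdfs:label` plus `skos:altLabel` as the source of "labels and aliases" is an assumption. The only external pieces it relies on are the public `wbsearchentities` API action and the Wikidata Query Service endpoint.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_search_url(term, language="en", limit=5):
    """Build a wbsearchentities request URL for one search term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def entity_ids(search_response):
    """Pull entity IDs (e.g. 'Q123') out of a decoded search response."""
    return [hit["id"] for hit in search_response.get("search", [])]


def build_label_query(entity_id):
    """SPARQL for every label and alias of one entity, with its language tag."""
    return (
        "SELECT ?label ?lang WHERE {\n"
        "  VALUES ?predicate { rdfs:label skos:altLabel }\n"
        f"  wd:{entity_id} ?predicate ?label .\n"
        "  BIND(LANG(?label) AS ?lang)\n"
        "}"
    )


def build_sparql_url(query):
    """Build a GET URL that asks the query service for JSON results."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def parse_labels(sparql_response):
    """Turn decoded SPARQL JSON results into (language, label) pairs."""
    bindings = sparql_response.get("results", {}).get("bindings", [])
    return [(row["lang"]["value"], row["label"]["value"]) for row in bindings]


def fetch_json(url, user_agent="keywords-sketch/0.1 (example only)"):
    """Fetch and decode JSON; the User-Agent string here is a placeholder."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request) as response:
        return json.load(response)
```

A run for a single term would chain these: `fetch_json(build_search_url(term))`, then `entity_ids(...)`, then for each ID `parse_labels(fetch_json(build_sparql_url(build_label_query(entity_id))))`. Keeping URL construction and response parsing separate from the network call makes each step checkable without hitting Wikidata.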