User:Jason.nlw/National Wikimedian/Wici-Iechyd project summary
Wicipedia is the most viewed Welsh language website in the world with over 100,000 articles and around 750,000 hits a month. A recent audit of the content revealed that Welsh Wicipedia has very few articles about health and yet the few articles which do exist are, on average, being viewed more times than articles on any other subject. This suggests that Welsh speakers want to consume information about their health in Welsh, through Wicipedia.
- 1,500 Welsh language articles on health compared to 84,000 in English
- 2.09% of Welsh Wicipedia articles are about health, compared to 6.67% in English
- Views of Welsh articles about health make up 12% of total page views, more than any other subject.
It is thought that Wikipedia has become the most consulted health resource in the world (based on 4.8 billion pageviews in 2013). It is therefore vital that it contains reliable, comprehensive information on all aspects of health, from medications and surgical procedures to fitness, wellbeing and historical information.
It is estimated that poor health costs Wales billions each year, and free, easy access to health information through the medium of Welsh (on Wicipedia) would help provide the public with the information they need in a format they are familiar with.
The broad aim of the project was to significantly increase the number of health related articles available on the Welsh Wicipedia, through an outreach program and by developing machine translation techniques, which could be applied in future to increase the quality and quantity of Wicipedia content.
This was a 9-month project, carried out by National Library of Wales staff, with a £40,000 grant from the Welsh Government.
- Create 3000 new health related articles
- Hold a total of 4 Edit-a-thons in north, mid, south and south west Wales
- Report on how to use machine translation to facilitate the creation of Wicipedia articles in the future, with evidence of a pilot project and links to articles created using this process.
A full list of articles created can be seen here
4699 articles were created (3000 target):
- 280 articles were created by volunteers and event participants
- 41 were created using existing content secured on an open licence
- 4,378 were created using open data to semi-automate content creation
Automated article creation
Articles created using open data make up the bulk of the articles created. These were created by taking data from Wikidata, PubChem and PubMed, translating it into Welsh and using it to build templates for mass producing articles on a given subject.
Articles were created on 3 topics:
1. Human Genome (2715)
These articles about human genes were created using only open data, inserted into a single fixed text template that was used in all articles. A Welsh language infobox for genes, which pulls in data and images automatically from Wikidata, was created as part of the process.
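The single-template approach can be illustrated with a minimal Python sketch. It is not the project's actual tooling: the template wording, the infobox name, the field names and the sample records are all assumptions made for illustration, and the real template text was in Welsh.

```python
from string import Template

# Hypothetical article template. The project's real template was written in
# Welsh; English placeholder text and an assumed infobox name are used here.
ARTICLE_TEMPLATE = Template(
    "{{Infobox gene}}\n"
    "'''$symbol''' is a human gene located on chromosome $chromosome. "
    "It encodes the protein $protein.\n"
)

# Illustrative sample records standing in for data taken from open sources.
SAMPLE_GENES = [
    {"symbol": "BRCA1", "chromosome": "17",
     "protein": "breast cancer type 1 susceptibility protein"},
    {"symbol": "TP53", "chromosome": "17",
     "protein": "cellular tumor antigen p53"},
]


def build_article(record):
    """Fill the shared template with one record's data.

    substitute() raises KeyError if a field is missing, so an incomplete
    record fails loudly instead of producing a half-filled article.
    """
    return ARTICLE_TEMPLATE.substitute(record)


def build_all(records):
    """Return a mapping of article title (gene symbol) to article text."""
    return {record["symbol"]: build_article(record) for record in records}


if __name__ == "__main__":
    for title, text in build_all(SAMPLE_GENES).items():
        print(title)
        print(text)
```

Because every article shares one template, correcting the wording once and regenerating would update the whole set.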
2. Biographies of health/medical pioneers (776)
These articles include biographical data from Wikidata, such as birth/death date and location, education details and awards won. To bring a human element to each article, they also include a short summary of the person's achievements, mostly taken from information on the English Wikipedia. These summaries were written and translated by hand by the project staff. It was important to include this human element in the biographies in order to place emphasis on the important work of each person, something that could not be replicated with the available data.
3. Medical Drugs (887)
These articles include information about important medical drugs, their trade names, their medical use and the conditions they have been used to treat. Most of the information comes from Wikidata, and a Welsh language ‘Drugs’ infobox was developed for use with these articles. The articles also contain a brief summary of each drug, taken from the open medical resource PubChem. These, along with the names of the drugs, were translated professionally.
In addition to the Wikidata infobox, these articles also use Wikidata to produce lists of medical conditions treated, and alternative names for each drug. This means that if a drug is licensed under a new name, or is used to treat new conditions, the articles will update automatically, as the data is simply pulled in from Wikidata in real time.
For example, placing this one line of code in each article produces a list of all known conditions treated by the drug - as long as the data is available in Welsh. Template:Wikidata
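The on-wiki template itself is not reproduced here. As a rough illustration of the same idea outside the wiki, the Python sketch below builds a SPARQL query for the conditions treated by a drug and keeps only Welsh labels. It assumes the Wikidata property P2175 ("medical condition treated"); the item ID `Q123`, the function names and the sample response are placeholders invented for this example.

```python
from typing import List


def conditions_treated_query(drug_qid: str, lang: str = "cy") -> str:
    """Build a SPARQL query for conditions treated by a drug (P2175).

    Only labels in the requested language are returned, so a condition
    with no Welsh label simply does not appear in the results.
    """
    return (
        "SELECT ?condition ?label WHERE {\n"
        f"  wd:{drug_qid} wdt:P2175 ?condition .\n"
        "  ?condition rdfs:label ?label .\n"
        f'  FILTER(LANG(?label) = "{lang}")\n'
        "}"
    )


def labels_from_response(response: dict) -> List[str]:
    """Extract the label strings from a SPARQL JSON results document."""
    bindings = response["results"]["bindings"]
    return sorted(binding["label"]["value"] for binding in bindings)


# Hand-written stand-in for a query response (placeholder values only).
SAMPLE_RESPONSE = {
    "results": {
        "bindings": [
            {"label": {"xml:lang": "cy", "value": "label-cy-2"}},
            {"label": {"xml:lang": "cy", "value": "label-cy-1"}},
        ]
    }
}

if __name__ == "__main__":
    # "Q123" is a placeholder, not the ID of a real drug item.
    print(conditions_treated_query("Q123"))
    print(labels_from_response(SAMPLE_RESPONSE))
```

The query string could be sent to the Wikidata Query Service; no network call is made here so that the sketch stays self-contained.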
Articles using openly licensed existing text
An important part of the project involved lobbying existing content producers to share their content under an open licence for reuse on Wicipedia. Creating an open access culture within the sector would create a sustainable environment for sharing Welsh language health related content as widely as possible.
We were able to build partnerships with the following content producers:
- WJEC (6)
- Meddwl.org (18)
- British Lung Foundation (17)
These articles contain templates, identifying the source of the content and advising people to view the original article, rather than rely on Wicipedia content, which should only ever be used for guidance.
Articles created by hand
These were articles created by hand as part of the project. Some were created during Edit-a-thon events. Others were created by members of the library volunteer team, members of the online Wici Project and other contributors. We worked with the professional Welsh translation course at Aberystwyth University to translate a number of articles, including this article on the Cornea.
A ‘Wiki Project’ was set up to encourage long-standing Wicipedia contributors to engage with the project. The project currently has 10 members.
Some articles considered of high importance were also created by the project team using Wicipedia’s content translation tool to manually translate content from peer-reviewed English language articles. These include articles such as cy:Breast Cancer, cy:Morning Sickness and cy:Antibiotic.
Some data was already available in Welsh via Wikidata, but a large number of words and terms were translated in order to produce Welsh language content. All manual and automatic translations were fed back into Wikidata, making them publicly available as open data.
A total of 6472 Welsh labels were added to existing Wikidata items, including:
- 2714 genome names (Same as English)
- 901 drug names
- 790 personal names
- 569 place names
- 487 medical conditions
- 454 award titles
- 308 university names
- 203 organization names
- 46 medical professions
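The report does not say which tool was used to upload the labels. One possible route, sketched below purely as an assumption, is to generate batch commands in the QuickStatements text format, where each line gives an item ID, the label field for a language (`Lcy` for Welsh) and the quoted label, separated by tabs. The function name, item IDs and labels are placeholders.

```python
from typing import Dict, List


def welsh_label_commands(labels: Dict[str, str], lang: str = "cy") -> List[str]:
    """Turn an {item ID: label} mapping into QuickStatements-style lines.

    Each line has the form: <item ID> TAB L<lang> TAB "<label>".
    Empty labels are skipped because there is nothing to add for them.
    """
    commands = []
    for qid, label in sorted(labels.items()):
        label = label.strip()
        if not label:
            continue  # no translation available for this item
        commands.append(f'{qid}\tL{lang}\t"{label}"')
    return commands


if __name__ == "__main__":
    # Placeholder item IDs and labels, not real project data.
    sample = {"Q222": "Label B", "Q111": " Label A ", "Q333": ""}
    for line in welsh_label_commands(sample):
        print(line)
```

Generating the commands as plain text keeps the translated terms reviewable before anything is written to Wikidata.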
3 Wici-Iechyd themed edit-a-thons were held as part of this project.
Click the links to see articles created at these events.
Below is a list of Wicipedia user names for people who participated in the Wici-Iechyd Edit-a-thons.
- Gwenno wicicaerdydd
In addition to these edit-a-thons, participants were also engaged in the project through the Wicipedia project page, the NLW volunteer program, the professional translation course at Aberystwyth University, the Cymdeithas Feddygol, and an introduction to Wicipedia session at the Royal College of Nursing, Cardiff on January 15th 2018. Statistics for participants in these activities are included in the overall figures for the project at the end of this report.
The impact of the project can be measured in terms of output and outcomes. Here is a summary of what this project has achieved.
- 4699 new articles
- 6472 Welsh labels added to Wikidata
- 4115 Medical terms translated
- 25,000 views of articles created (08/17 - 03/18)
- 3 public Edit-a-thons
- Wicipedia training for 45 people
- 1 presentation about the project for Cymdeithas Feddygol
- 41 professional articles released on an open licence
- Collaborations with WJEC, Meddwl.org and British Lung Foundation.