Abstract
The number of scientific PDF documents is increasing at a very rapid pace, and searching this growing volume of documents is becoming a time-consuming task. To make search and storage more efficient, we need a mechanism to extract metadata from these documents and store it according to its semantics. Extracting and storing this information manually is very time-consuming and requires considerable human effort, owing to the large number of documents and their varying formats. In this paper, we present a rule-based approach to extracting metadata from research articles. The approach was developed and evaluated on a diverse dataset provided by ESWC (2016) that covers a number of different formats and features. Evaluation results show that our proposed approach performs 22% better than CERMINE and 9% better than GROBID.
Original language | English |
---|---|
Title of host publication | CITISIA 2020 IEEE Conference on Innovative Technologies in Intelligent System and Industrial Application |
Subtitle of host publication | Conference Proceedings 25-27 November 2020, Sydney, Australia |
Place of Publication | United States |
Publisher | IEEE |
Pages | 1-4 |
Number of pages | 4 |
ISBN (Electronic) | 9781728194363 |
ISBN (Print) | 9781728194370 |
Publication status | Published - 25 Nov 2020 |
Event | 5th IEEE International Conference on Innovative Technologies in Intelligent Systems and Industrial Applications, CITISIA 2020, Charles Sturt University Sydney campus, Sydney, Australia. Duration: 25 Nov 2020 → 27 Nov 2020. Conference website: https://web.archive.org/web/20201128085551/https://ieee-citisia.org/; conference program: https://web.archive.org/web/20210124015105/https://ieee-citisia.org/wp-content/uploads/2020/11/Conference-Program-new1.pdf; full paper proceedings: https://ieeexplore.ieee.org/xpl/conhome/9371766/proceeding?pageNumber=4 |
Publication series
Name | CITISIA 2020 - IEEE Conference on Innovative Technologies in Intelligent Systems and Industrial Applications, Proceedings |
---|---|
Conference
Conference | 5th IEEE International Conference on Innovative Technologies in Intelligent Systems and Industrial Applications, CITISIA 2020 |
---|---|
Country/Territory | Australia |
City | Sydney |
Period | 25/11/20 → 27/11/20 |
Other | The “Conference on Innovative Technologies in Intelligent Systems & Industrial Applications” (CITISIA) is a student conference that aims to provide students of higher learning institutions with a platform for presenting their own projects. It is also a measure of recognition of students’ professional and technical achievements – by industries and international organizations such as IEEE. This conference is designed to facilitate exchanges of ideas through communication, networking and learning from others, for students and IEEE Chapters in terms of greater collaboration. |