.. code-block:: xml 2019-12-17 2021-11-24 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource Πρακτικά της Ολομέλειας του Ελληνικού Κοινοβουλίου (1989-2019) Greek Parliament Plenary Sessions (1989-2019) Σώμα κειμένων με τα πρακτικά των συνεδριάσεων της Ολομέλειας του Ελληνικού Κοινοβουλίου τα τελευταία 30 χρόνια (περισσότερες από 1.000.000 ομιλίες). Το παρόν σώμα κειμένων περιλαμβάνει όλα τα πρακτικά σε μορφή txt. Για ευκολότερη επεξεργασία έχουν δημιουργηθεί και μικρότερα υποσύνολα, με ανώτατο μέγεθος συμπιεσμένων αρχείων τα 40 MB ανά υποσύνολο, που ανταποκρίνονται στις εκάστοτε βουλευτικές περιόδους. This is a collection of the raw minutes of the Greek Parliament plenary sessions of the last 30 years (more than 1,000,000 speeches). The corpus contains all raw data in txt format. To make the resource easier to process, it has also been split into smaller subcorpora, one per Greek parliamentary term, with a maximum compressed size of 40 MB per subcorpus. http://hdl.handle.net/11500/ATHENA-0000-0000-5D62-A 1.0.0 (automatically assigned) person@ilsp.gr Person Person_Surname Person_Name Πρακτικά της Ολομέλειας του Ελληνικού Κοινοβουλίου (1989-2019) (2019). Version 1.0.0 (automatically assigned). [Dataset (Text corpus)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-5D62-A Greek Parliament Plenary Sessions (1989-2019) (2019). Version 1.0.0 (automatically assigned). [Dataset (Text corpus)]. CLARIN:EL. 
http://hdl.handle.net/11500/ATHENA-0000-0000-5D62-A monolingual, political science Organization Ινστιτούτο Επεξεργασίας του Λόγου Institute for Language and Speech Processing http://www.ilsp.gr/ Corpus http://w3id.org/meta-share/meta-share/rawCorpus CorpusTextPart http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/monolingual el el http://w3id.org/meta-share/meta-share/downloadable http://www.hiddenLocation.org 5079.0 http://w3id.org/meta-share/meta-share/file http://w3id.org/meta-share/omtd-share/Text http://w3id.org/meta-share/meta-share/UTF-8 Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode https://creativecommons.org/licenses/by/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/allowsProcessing http://w3id.org/meta-share/meta-share/public CC-BY-4.0 Πρακτικά της Ολομέλειας του Ελληνικού Κοινοβουλίου (1989-2019). Άδεια: Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-5D62-A (CLARIN:EL) Greek Parliament Plenary Sessions (1989-2019) used under Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-5D62-A (CLARIN:EL) false false

.. _GoldenCorpus:

Monolingual corpus #2
=======================

.. raw:: html

.. code-block:: xml 2021-02-25 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource Golden Part of Speech Tagged Corpus Το Golden Part of Speech Tagged Corpus είναι ένα σύνολο δεδομένων που αποτελείται από κείμενα επιλεγμένα από το σύνολο των κειμένων του ΕΘΕΓ, και το οποίο είναι μεγέθους 100.000 λέξεων. Τα κείμενα συλλέχθηκαν από το διαδίκτυο με τεχνικές διάσχισης σημασιολογικού ιστού (web crawling), με τα εξής κριτήρια: κείμενα αποκλειστικά με ελεύθερη διάθεση (CC0 4.0) ή με αναφορά δημιουργού (CC BY 4.0) και ευρεία ποικιλία πηγών και θεμάτων. Η διαδικασία που ακολουθήθηκε για την υλοποίηση αυτού του σώματος δεδομένων περιλαμβάνει τα ακόλουθα βήματα: - Καθαρισμό από άχρηστα στοιχεία (boilerplate material) - Διόρθωση ορθογραφικών λαθών με το χέρι - Αυτόματη μορφοσυντακτική επισημείωση με χρήση του αυτόματου μορφολογικού επισημειωτή του ΙΕΛ/ΕΚ ΑΘΗΝΑ (ILSP Feature-based multi-tiered POS Tagger), με την οποία για κάθε λέξη του σώματος δεδομένων αποδόθηκε ο γραμματικός της χαρακτηρισμός και το λήμμα στο οποίο ανήκει - Έλεγχος και διόρθωση των αποτελεσμάτων της αυτόματης επισημείωσης, επίσης με το χέρι. Το Golden Corpus XML διατίθεται σε μορφή XML, ως αρχείο το οποίο περιλαμβάνει όλα τα κείμενα. Η δομή κάθε κειμένου περιλαμβάνει αρχικά ορισμένα μεταδεδομένα για το κείμενο και μετά το ίδιο το κείμενο, που είναι δομημένο σε επισημειωμένες παραγράφους, προτάσεις και λέξεις. Κάθε λέξη συνοδεύεται από πληροφορίες για το λήμμα της, τον γραμματικό χαρακτηρισμό της και την θέση της μέσα στην πρόταση με βάση τους χαρακτήρες έναρξης και λήξης. Επιλέχθηκε ως μορφή διάθεσης η XML ώστε το σύνολο δεδομένων να μπορεί να χρησιμοποιηθεί σε διαφορετικά περιβάλλοντα, ανεξαρτήτως λειτουργικού συστήματος. 
The Golden Part-of-Speech Tagged Corpus is a 100,000-word subset of the Hellenic National Corpus (HNC); it consists of selected texts from a variety of sources covering various domains. These texts have been crawled from the web and are licensed under either CC0 4.0 or CC BY 4.0. The corpus underwent the following stages: • cleaning and removal of boilerplate material, • manual correction of typos and spelling mistakes, • automatic lemmatization and part-of-speech tagging for each word, using the ILSP Feature-based multi-tiered POS Tagger, and • manual correction of the ILSP Feature-based multi-tiered POS Tagger results. The Golden Part of Speech Tagged Corpus is available as a single XML file, containing all texts in the following structure: first, some metadata about the text and then the text itself with annotation at the level of paragraphs, sentences and words. Each word comes with information on its lemma, its POS and its boundaries within the sentence (start and end character offsets). XML was chosen as the most appropriate format as it can be used in various environments, regardless of the operating system in use. http://hdl.handle.net/11500/ATHENA-0000-0000-5E7D-C 1 person@ilsp.athena-innovation.gr Person Person_Surname Person_Name Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά (2021). Golden Part of Speech Tagged Corpus. Version 1. [Dataset (Text corpus)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-5E7D-C Institute for Language and Speech Processing - Athena Research Center (2021). Golden Part of Speech Tagged Corpus. Version 1. [Dataset (Text corpus)]. CLARIN:EL. 
http://hdl.handle.net/11500/ATHENA-0000-0000-5E7D-C monolingual morphosyntacticAnnotation-posTagging Organization Ινστιτούτο Επεξεργασίας του Λόγου Institute for Language and Speech Processing http://www.ilsp.gr/ Ελληνικός Θησαυρός της Ελληνικής Γλώσσας Hellenic National Corpus http://hdl.handle.net/11500/ATHENA-0000-0000-23E2-9 3.0 Corpus http://w3id.org/meta-share/meta-share/annotatedCorpus CorpusTextPart http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/monolingual el el http://w3id.org/meta-share/omtd-share/PartOfSpeech ILSP Feature-based multi-tiered POS Tagger http://hdl.handle.net/11500/ATHENA-0000-0000-23E8-3 1 http://w3id.org/meta-share/omtd-share/StructuralAnnotationType http://w3id.org/meta-share/meta-share/sentence http://w3id.org/meta-share/meta-share/token1 http://w3id.org/meta-share/omtd-share/Lemma http://w3id.org/meta-share/meta-share/downloadable http://www.hiddenLocation.org http://fixme.com 100000.0 http://w3id.org/meta-share/meta-share/word3 http://w3id.org/meta-share/omtd-share/Xml http://w3id.org/meta-share/meta-share/UTF-8 Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode https://creativecommons.org/licenses/by/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/allowsProcessing http://w3id.org/meta-share/meta-share/public CC-BY-4.0 Golden Part of Speech Tagged Corpus. Δημιουργός: Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά. Άδεια: Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). 
Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-5E7D-C (CLARIN:EL) Golden Part of Speech Tagged Corpus by Institute for Language and Speech Processing - Athena Research Center used under Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-5E7D-C (CLARIN:EL) false false

.. _BulTM:

Bilingual corpus
==================

.. raw:: html

.. code-block:: xml 2015-09-11 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource Ελληνο-Βουλγαρικό κειμενικό σώμα Bul-TM Greek-Bulgarian Bul-TM parallel corpus Bul-TM Δίγλωσσο σώμα παράλληλων προτάσεων (βουλγαρικά – ελληνικά) από κείμενα γενικής γλώσσας που έχουν αντληθεί από το διαδίκτυο. Το σώμα διατίθεται με τη μορφή TMX (ως στοιχισμένες προτάσεις). Parallel bilingual corpus (web documents, general domain) aligned at sentence level; the corpus is available in TMX format. http://hdl.handle.net/11500/ATHENA-0000-0000-23E4-7 1.0.0 (automatically assigned) person@ilsp.gr person@ilsp.gr Person Person_Surname Person_Name Person Person_Surname Person_Name Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά (2015). Ελληνο-Βουλγαρικό κειμενικό σώμα Bul-TM. Version 1.0.0 (automatically assigned). [Dataset (Text corpus)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-23E4-7 Institute for Language and Speech Processing - Athena Research Center (2015). Greek-Bulgarian Bul-TM parallel corpus. Version 1.0.0 (automatically assigned). [Dataset (Text corpus)]. CLARIN:EL. 
http://hdl.handle.net/11500/ATHENA-0000-0000-23E4-7 bilingual parallel alignment writtenLanguage Political Science DDC320 society Organization Ινστιτούτο Επεξεργασίας του Λόγου Institute for Language and Speech Processing http://www.ilsp.gr/ 2005-10-01 2007-09-30 Development of a Bulgarian to Greek and Greek to Bulgarian Translation Memory workbench https://www.ilsp.gr/projects/bultm/ http://w3id.org/meta-share/meta-share/nationalFunds Organization Ministry of Economy and Finances http://w3id.org/meta-share/omtd-share/LanguageTechnology http://w3id.org/meta-share/omtd-share/MachineTranslation http://w3id.org/meta-share/omtd-share/MachineTranslation http://w3id.org/meta-share/omtd-share/LanguageTechnology Π2.1-Σώματα_Κειμένων-v1 Deliverable 2.1 - Text corpora-v1 Corpus http://w3id.org/meta-share/meta-share/annotatedCorpus CorpusTextPart http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/bilingual http://w3id.org/meta-share/meta-share/parallel el el bg bg http://w3id.org/meta-share/meta-share/writtenLanguage http://w3id.org/meta-share/omtd-share/Alignment1 http://w3id.org/meta-share/meta-share/sentence false http://w3id.org/meta-share/meta-share/automatic TrAid unspecified The JRC-Acquis Corpus, version 3.0 http://hdl.handle.net/11500/ATHENA-0000-0000-25C9-4 1.0.0 (automatically assigned) SETIMES - A parallel corpus of the Balkan languages http://hdl.handle.net/11500/ATHENA-0000-0000-2591-2 1 original source: EU texts http://w3id.org/meta-share/meta-share/downloadable http://www.hiddenLocation.org http://fixme.com 10000000.0 http://w3id.org/meta-share/meta-share/token http://w3id.org/meta-share/omtd-share/Tmx http://w3id.org/meta-share/meta-share/UTF-8 Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode https://creativecommons.org/licenses/by/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/allowsDirectAccess 
http://w3id.org/meta-share/meta-share/allowsProcessing http://w3id.org/meta-share/meta-share/public CC-BY-4.0 Ελληνο-Βουλγαρικό κειμενικό σώμα Bul-TM. Δημιουργός: Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά. Άδεια: Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-23E4-7 (CLARIN:EL) Greek-Bulgarian Bul-TM parallel corpus by Institute for Language and Speech Processing - Athena Research Center used under Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-23E4-7 (CLARIN:EL) false false

.. _DictaSign:

Multilingual corpus
======================

.. raw:: html

.. code-block:: xml 2015-12-23 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource Σώμα DICTA-SIGN DICTA-SIGN corpus Σώμα πολυμεσικών δεδομένων (βίντεο) για τέσσερις νοηματικές γλώσσες (ελληνική, αγγλική, γαλλική και γερμανική). Περιλαμβάνονται καταγραφές από τουλάχιστον 14 πληροφορητές για κάθε γλώσσα σε τουλάχιστον δίωρες συνεδρίες, με βάση κοινά σενάρια και καθήκοντα για όλες τις γλώσσες. Τα δεδομένα είναι εν μέρει επισημειωμένα. Το σώμα δεδομένων διατίθεται σε ιστότοπο που έχει δημιουργηθεί και συντηρείται από το Πανεπιστήμιο του Αμβούργου και είναι προσβάσιμος <a href="http://www.sign-lang.uni-hamburg.de/dicta-sign/portal/index.html" target="_blank">εδώ</a>. Multimedia corpus (video) for four sign languages (English, French, German and Greek), with at least 14 informants per language and sessions of approximately 2 hours, using the same elicitation materials (scripts and tasks) across languages. The data is partially annotated. The corpus is available through a dedicated website, created and maintained by the University of Hamburg, accessible <a href="http://www.sign-lang.uni-hamburg.de/dicta-sign/portal/index.html" target="_blank">here</a>. http://hdl.handle.net/11500/ATHENA-0000-0000-28C5-5 1.0.0 (automatically assigned) http://www.sign-lang.uni-hamburg.de/dicta-sign/portal/index.html Person Person_Surname Person_Name Person Person_Surname Person_Name School of Computing Sciences - University of East Anglia; Research Institute of Computer Science - Paul Sabatier University; Institute of German Sign Language and Communication of the Deaf - University of Hamburg (2015). Σώμα DICTA-SIGN. Version 1.0.0 (automatically assigned). [Dataset (Text and Video corpus)]. CLARIN:EL. 
http://hdl.handle.net/11500/ATHENA-0000-0000-28C5-5 School of Computing Sciences - University of East Anglia; Research Institute of Computer Science - Paul Sabatier University; Institute of German Sign Language and Communication of the Deaf - University of Hamburg (2015). DICTA-SIGN corpus. Version 1.0.0 (automatically assigned). [Dataset (Text and Video corpus)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-28C5-5 multilingual parallel scripts signLanguage Geography & Travel DDC910 Organization School of Computing Sciences https://www.uea.ac.uk/about/school-of-computing-sciences Organization Research Institute of Computer Science Institut de Recherche en Informatique de Toulouse http://www.irit.fr/ Organization Institute of German Sign Language and Communication of the Deaf https://www.idgs.uni-hamburg.de/en.html 2009-02-01 2012-01-31 Sign Language Recognition, Generation and Modelling with application in Deaf Communication http://www.dictasign.eu/ http://w3id.org/meta-share/meta-share/euFunds Organization Ευρωπαϊκή Επιτροπή European Commission https://ec.europa.eu/info/index_en http://w3id.org/meta-share/omtd-share/LanguageTechnology http://w3id.org/meta-share/omtd-share/LanguageTechnology Corpus http://w3id.org/meta-share/meta-share/rawCorpus CorpusTextPart http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/multilingual http://w3id.org/meta-share/meta-share/unspecified fr fr el el de de en en http://w3id.org/meta-share/meta-share/signLanguage scripts http://w3id.org/meta-share/meta-share/video scripts of the tasks for the video false CorpusVideoPart http://w3id.org/meta-share/meta-share/video http://w3id.org/meta-share/meta-share/multilingual http://w3id.org/meta-share/meta-share/parallel gss gss bfi bfi gsg gsg fsl fsl http://w3id.org/meta-share/meta-share/signLanguage natural signers http://w3id.org/meta-share/meta-share/none1 http://w3id.org/meta-share/meta-share/elicited http://w3id.org/meta-share/meta-share/dialogue 
http://w3id.org/meta-share/meta-share/rolePlay http://w3id.org/meta-share/meta-share/none3 http://w3id.org/meta-share/meta-share/interactive1 http://w3id.org/meta-share/meta-share/text scripts of the tasks for the video false http://w3id.org/meta-share/meta-share/accessibleThroughInterface http://www.sign-lang.uni-hamburg.de/dicta-sign/portal/index.html 10.0 http://w3id.org/meta-share/meta-share/file http://w3id.org/meta-share/meta-share/unspecified 25.0 http://w3id.org/meta-share/meta-share/hour1 http://w3id.org/meta-share/omtd-share/mp4 Creative Commons Attribution Non Commercial No Derivatives 4.0 International https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode https://creativecommons.org/licenses/by-nc-nd/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/nonCommercialUse http://w3id.org/meta-share/meta-share/noDerivatives http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/public CC-BY-NC-ND-4.0 Σώμα DICTA-SIGN. Δημιουργός: School of Computing Sciences - University of East Anglia, Research Institute of Computer Science - Paul Sabatier University and Institute of German Sign Language and Communication of the Deaf - University of Hamburg. Άδεια: Creative Commons Attribution Non Commercial No Derivatives 4.0 International (https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode, https://creativecommons.org/licenses/by-nc-nd/4.0/). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-28C5-5 (CLARIN:EL) DICTA-SIGN corpus by School of Computing Sciences - University of East Anglia, Research Institute of Computer Science - Paul Sabatier University and Institute of German Sign Language and Communication of the Deaf - University of Hamburg used under Creative Commons Attribution Non Commercial No Derivatives 4.0 International (https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode, https://creativecommons.org/licenses/by-nc-nd/4.0/). 
Source: http://hdl.handle.net/11500/ATHENA-0000-0000-28C5-5 (CLARIN:EL) false false

.. _LCRXML:

2. Lexical/Conceptual resources (LCR)
***************************************

.. _Kelly:

Monolingual LCR
========================

.. raw:: html

.. code-block:: xml 2015-12-08 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource KELLY word-list Greek KELLY word-list EL Ο μονόγλωσσος πόρος KELLY EL αποτελεί μέρος του ψηφιακού εκπαιδευτικού υλικού που αναπτύχθηκε για εννέα γλώσσες, από τις οποίες 4 ευρέως διδασκόμενες (Αγγλικά, Αραβικά, Ρωσικά και Κινέζικα) και 5 λιγότερο διδασκόμενες (Ελληνικά, Ιταλικά, Σουηδικά, Πολωνικά και Νορβηγικά) με σκοπό να υποβοηθήσει την εκμάθησή τους ως ξένης/δεύτερης γλώσσας. Πιο συγκεκριμένα, πρόκειται για το ελληνικό κομμάτι ενός υλικού το οποίο ολοκληρώνεται από μία σειρά μονόγλωσσων και δίγλωσσων εγγραφών καλύπτοντας συνολικά 36 γλωσσικά ζεύγη. Η επιλογή των λέξεων βασίστηκε σε ψηφιακούς γλωσσικούς πόρους για κάθε γλώσσα, η επιλογή των οποίων ακολούθησε κοινές προδιαγραφές σε όλες τις γλώσσες, με στόχο την εξασφάλιση ομοιόμορφου γλωσσικού υλικού. Συνδυάστηκε δε με επεξεργασία των αποτελεσμάτων από γλωσσολόγους και εκπαιδευτικούς για την ένταξη κάθε λέξης στο κατάλληλο επίπεδο του Κοινού Ευρωπαϊκού Πλαισίου Αναφοράς (CEFR, <a href="https://www.coe.int/en/web/common-european-framework-reference-languages" target="_blank">Common European Framework of Reference for Languages</a>). Η επιλογή του λεξιλογίου υιοθέτησε διαδικασία εξαγωγής γνώσης από τα κείμενα (εφαρμογή ακολουθιών τεχνολογιών όπως δομική επισημείωση κειμένου, μορφολογική επισημείωση και λημματοποίηση), ακολουθούμενη από εφαρμογή στατιστικών μετρικών για την εξαγωγή των πιο συχνών (και άρα απαραίτητων στην εκμάθηση της γλώσσας) λέξεων. 
Η παιδαγωγική διάσταση καθόρισε τις αρχές επεξεργασίας των αποτελεσμάτων της γλωσσικής τεχνολογίας από ειδικούς, οι οποίοι αξιολόγησαν τα αποτελέσματα της αυτοματοποιημένης διαδικασίας εξαγωγής των λέξεων από τα κείμενα, κατέταξαν τις λέξεις στα επίπεδα του CEFR, και αντιστοίχισαν διαγλωσσικά τις λέξεις, καταρτίζοντας λίστες λεξιλογίου για 36 ζεύγη γλωσσών. Στην <a href="https://spraakbanken.gu.se/eng/kelly" target="_blank">επίσημη ιστοσελίδα</a> υπάρχουν περισσότερες πληροφορίες για το έργο Kelly. Από το ίδιο σημείο είναι διαθέσιμες οι λίστες λέξεων για τα Αγγλικά, Αραβικά, Ιταλικά, Κινέζικα, Νορβηγικά και Ρωσικά, ενώ υπάρχει και <a href="http://kelly.sketchengine.co.uk/" target="_blank">διεπαφή</a> που προσφέρει πρόσβαση στη βάση δεδομένων, όπου μπορεί κανείς να αναζητήσει μια λέξη και να δει τις μεταφράσεις της στις άλλες γλώσσες. The monolingual lexical conceptual resource KELLY EL is part of digital material created for educational purposes, i.e. to facilitate the learning of a foreign/second language. Nine different languages were involved, four commonly learned (English, Arabic, Russian and Chinese) and five less commonly learned (Greek, Italian, Swedish, Polish and Norwegian). More precisely, KELLY EL is the Greek part of a material which consists of monolingual and bilingual word-lists covering 36 language pairs in total. The choice of words was based for each language on digital language resources. The same standards were applied to all languages for the choice of these digital language resources in order to ensure uniformity. The material was analyzed and edited by linguists and education professionals and each word was mapped to the appropriate language level of the Common European Framework of Reference (CEFR, <a href="https://www.coe.int/en/web/common-european-framework-reference-languages" target="_blank">Common European Framework of Reference for Languages</a>). 
The vocabularies were created by extracting knowledge from texts (using processes such as lemmatization and morphological and structural annotation), followed by statistical metrics for extracting the most frequent (and therefore necessary for language learning) words. Experts then evaluated the language technology results according to pedagogical principles, mapped the words to the CEFR levels and aligned them across languages, producing word-lists for 36 language pairs. More information about Kelly is available from the <a href="https://spraakbanken.gu.se/eng/kelly" target="_blank">official project site</a>, where the lists in English, Arabic, Italian, Chinese, Norwegian and Russian can be downloaded. There is also a <a href="http://kelly.sketchengine.co.uk/" target="_blank">database interface</a>, which can be used to explore the links between words selected for each of these languages. http://hdl.handle.net/11500/ATHENA-0000-0000-25C1-C 1.0.0 (automatically assigned) person@ilsp.athena-innovation.gr Person Person_Surname Person_Name Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά (2015). KELLY word-list Greek. Version 1.0.0 (automatically assigned). [Dataset (Lexical/Conceptual Resource)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-25C1-C Institute for Language and Speech Processing - Athena Research Center (2015). KELLY word-list Greek. Version 1.0.0 (automatically assigned). [Dataset (Lexical/Conceptual Resource)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-25C1-C monolingual Organization Ινστιτούτο Επεξεργασίας του Λόγου Institute for Language and Speech Processing http://www.ilsp.gr/ Kilgarriff, Adam, Frieda Charalabopoulou, Maria Gavrilidou, Janne Bondi Johannessen, Saussan Khalil, Sofie Johansson Kokkinakis, Robert Lew, Serge Sharoff, Ravikiran Vadlapudi and Elena Volodina (2014) Corpus-based vocabulary lists for language learners for nine languages. 
Language Resources and Evaluation, 48:121–163 https://doi.org/10.1007/s10579-013-9251-2 KELLY word-lists http://hdl.handle.net/11500/ATHENA-0000-0000-5862-F 1.0.0 (automatically assigned) LexicalConceptualResource http://w3id.org/meta-share/meta-share/wordlist http://w3id.org/meta-share/meta-share/other LexicalConceptualResourceTextPart http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/monolingual el el http://w3id.org/meta-share/meta-share/downloadable http://www.hiddenLocation.org 7385.0 http://w3id.org/meta-share/meta-share/entry http://w3id.org/meta-share/meta-share/unspecified Creative Commons Attribution Non Commercial 4.0 International https://creativecommons.org/licenses/by-nc/4.0/legalcode https://creativecommons.org/licenses/by-nc/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/nonCommercialUse http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/allowsProcessing http://w3id.org/meta-share/meta-share/public CC-BY-NC-4.0 KELLY word-list Greek. Δημιουργός: Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά. Άδεια: Creative Commons Attribution Non Commercial 4.0 International (https://creativecommons.org/licenses/by-nc/4.0/legalcode, https://creativecommons.org/licenses/by-nc/4.0/). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-25C1-C (CLARIN:EL) KELLY word-list Greek by Institute for Language and Speech Processing - Athena Research Center used under Creative Commons Attribution Non Commercial 4.0 International (https://creativecommons.org/licenses/by-nc/4.0/legalcode, https://creativecommons.org/licenses/by-nc/4.0/). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-25C1-C (CLARIN:EL) false false

.. _OrossimoHistory:

Bilingual LCR
========================

.. raw:: html

.. code-block:: xml 2016-12-07 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource Ορολογικός Πόρος ΟΡΟΣΗΜΟ - Ιστορία Orossimo Terminological Resource - History Πρόκειται για δίγλωσσο ορολογικό πόρο που προέρχεται από αντίστοιχο σώμα κειμένων ακαδημαϊκού λόγου του γνωστικού τομέα της Ιστορίας. A bilingual terminological glossary extracted from academic discourse texts belonging to the History domain. http://hdl.handle.net/11500/ATHENA-0000-0000-4B4B-9 1.0.0 (automatically assigned) person@ilsp Person Person_Surname Person_Name Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά (2016). Ορολογικός Πόρος ΟΡΟΣΗΜΟ - Ιστορία. Version 1.0.0 (automatically assigned). [Dataset (Lexical/Conceptual Resource)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-4B4B-9 Institute for Language and Speech Processing - Athena Research Center (2016). Orossimo Terminological Resource - History. Version 1.0.0 (automatically assigned). [Dataset (Lexical/Conceptual Resource)]. CLARIN:EL. 
http://hdl.handle.net/11500/ATHENA-0000-0000-4B4B-9 bilingual History DDC900 Organization Ινστιτούτο Επεξεργασίας του Λόγου Institute for Language and Speech Processing http://www.ilsp.gr/ 1996-01-01 1998-12-31 ΟΡΟΣΗΜΟ OROSSIMO https://www.ilsp.gr/projects/orosimo/ http://w3id.org/meta-share/meta-share/nationalFunds Organization General Secretariat for Research and Technology Σώμα κειμένων ΟΡΟΣΗΜΟ - Ιστορία OROSSIMO Corpus - History http://hdl.handle.net/11500/ATHENA-0000-0000-240F-8 1.0.0 (automatically assigned) Συλλογή ηλεκτρονικών ορολογικών πόρων: μεθοδολογία και αποτελέσματα Collection of digital terminological resources: methodology and results Ορολογικός Πόρος ΟΡΟΣΗΜΟ Orossimo Terminological Resource http://hdl.handle.net/11500/ATHENA-0000-0000-4B49-B 1.0.0 (automatically assigned) LexicalConceptualResource http://w3id.org/meta-share/meta-share/terminologicalResource http://w3id.org/meta-share/meta-share/semantics http://w3id.org/meta-share/meta-share/lemma http://w3id.org/meta-share/meta-share/domain1 http://w3id.org/meta-share/meta-share/derivation http://w3id.org/meta-share/meta-share/translationEquivalent http://w3id.org/meta-share/meta-share/note1 LexicalConceptualResourceTextPart http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/bilingual en en el el http://w3id.org/meta-share/meta-share/downloadable http://www.hiddenLocation.org http://fixme.com 2353.0 http://w3id.org/meta-share/meta-share/term http://w3id.org/meta-share/omtd-share/MsExcel Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode https://creativecommons.org/licenses/by/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/allowsProcessing http://w3id.org/meta-share/meta-share/public CC-BY-4.0 Ορολογικός Πόρος ΟΡΟΣΗΜΟ - Ιστορία. Δημιουργός: Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά. 
Άδεια: Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-4B4B-9 (CLARIN:EL) Orossimo Terminological Resource - History by Institute for Language and Speech Processing - Athena Research Center used under Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-4B4B-9 (CLARIN:EL) false false

.. _TrilingualLex:

Multilingual LCR
========================

.. raw:: html

.. code-block:: xml 2019-03-27 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource Τρίγλωσσο Ορολογικό Λεξικό (ΤΟΛ) Trilingual Term Dictionary (TTD) Το Τρίγλωσσο Ορολογικό Λεξικό (ΤΟΛ) απευθύνεται σε αλλόφωνους μαθητές Γυμνασίου της μουσουλμανικής μειονότητας στη Θράκη με στόχο τη διευκόλυνση της μαθησιακής τους πορείας στο ελληνικό Γυμνάσιο. Περιλαμβάνει όρους που χρησιμοποιούνται στη διδασκαλία δεκατριών γνωστικών αντικειμένων –βιολογίας, γεωγραφίας, ιστορίας, κοινωνικής και πολιτικής αγωγής, λογοτεχνίας, μαθηματικών, μουσικής, νεοελληνικής γλώσσας, οικιακής οικονομίας, πληροφορικής, τεχνολογίας, φυσικής και χημείας– με βάση το αναλυτικό πρόγραμμα του Γυμνασίου. Οι όροι του ΤΟΛ έχουν συλλεχθεί από τα σχολικά εγχειρίδια που χρησιμοποιούνται στη διδασκαλία των αντίστοιχων μαθημάτων. Οι όροι είναι ταξινομημένοι ανά γνωστικό αντικείμενο και συνοδεύονται από εύληπτους και επιστημονικά ελεγμένους ορισμούς και μεταφράσεις στα Τουρκικά και τα Αγγλικά. The Trilingual Term Dictionary (TTD) is aimed at foreign students attending secondary school in Thrace, Greece. The aim of the TTD is threefold: to assist students in learning the subject areas of the curriculum, to improve their Greek language skills and to familiarize them with information technology. The TTD contains terms that are used in several subject areas (e.g. biology, geography, history, social and political studies etc.) that are taught in secondary school. The terms of the TTD (more than 5,000) have been collected from the schoolbooks. The terms are categorised by subject area, accompanied by definitions and translated into English and Turkish. 
http://hdl.handle.net/11500/ATHENA-0000-0000-5837-0 1.0.0 (automatically assigned) http://www.ilsp.gr/tol/ Person Person_Surname Person_Name Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά (2019). Τρίγλωσσο Ορολογικό Λεξικό (ΤΟΛ). Version 1.0.0 (automatically assigned). [Dataset (Lexical/Conceptual Resource)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-5837-0 Institute for Language and Speech Processing - Athena Research Center (2019). Trilingual Term Dictionary (TTD). Version 1.0.0 (automatically assigned). [Dataset (Lexical/Conceptual Resource)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-5837-0 multilingual Mathematics DDC510 Physics DDC530 Biology DDC570 Technology DDC600 Home & Family Management DDC640 Chemistry DDC540 Language DDC400 Geography & Travel DDC910 Music DDC780 History DDC900 Literature, Rhetoric & Criticism DDC800 Computer Science, Information & General Works DDC000 Organization Ινστιτούτο Επεξεργασίας του Λόγου Institute for Language and Speech Processing http://www.ilsp.gr/ Τρίγλωσσο Ορολογικό Λεξικό Trilingual Terminological Dictionary https://bit.ly/2V4hWLe https://www.ilsp.gr/projects/tol/ http://w3id.org/meta-share/meta-share/euFunds http://w3id.org/meta-share/meta-share/nationalFunds Organization Ministry of Education and Religious Affairs Organization Ευρωπαϊκή Επιτροπή European Commission https://ec.europa.eu/info/index_en LexicalConceptualResource http://w3id.org/meta-share/meta-share/terminologicalResource http://w3id.org/meta-share/meta-share/other LexicalConceptualResourceTextPart http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/multilingual el el en en tr tr http://w3id.org/meta-share/meta-share/downloadable http://www.hiddenLocation.org http://fixme.com 5224.0 http://w3id.org/meta-share/meta-share/entry http://w3id.org/meta-share/omtd-share/Xml Creative Commons Attribution Non Commercial No Derivatives 4.0 International 
https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode https://creativecommons.org/licenses/by-nc-nd/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/nonCommercialUse http://w3id.org/meta-share/meta-share/noDerivatives http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/public CC-BY-NC-ND-4.0 Τρίγλωσσο Ορολογικό Λεξικό (ΤΟΛ). Δημιουργός: Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά. Άδεια: Creative Commons Attribution Non Commercial No Derivatives 4.0 International (https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode, https://creativecommons.org/licenses/by-nc-nd/4.0/). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-5837-0 (CLARIN:EL) Trilingual Term Dictionary (TTD) by Institute for Language and Speech Processing - Athena Research Center used under Creative Commons Attribution Non Commercial No Derivatives 4.0 International (https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode, https://creativecommons.org/licenses/by-nc-nd/4.0/). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-5837-0 (CLARIN:EL) false false

.. _ToolXML:

3. Tool/Services
*******************

.. _LangIdentifier:

Single tool
================

.. raw:: html

.. code-block:: xml 2015-09-14 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource Αναγνωριστής γλώσσας ψηφιακού κειμένου ΙΕΛ ILSP Language Identification System ILSP LangId Ο Αναγνωριστής γλώσσας του ΙΕΛ είναι ένα εργαλείο που χρησιμοποιείται για την αυτόματη αναγνώριση της γλώσσας ενός ψηφιακού κειμένου. Το εργαλείο αναγνωρίζει την ελληνική, αγγλική, γερμανική, γαλλική, ολλανδική γλώσσα καθώς και για τα greeklish, ενώ δίνεται και η δυνατότητα προσθήκης επιπλέον γλωσσών με την προσθήκη συμπληρωματικών αρχείων για κάθε γλώσσα. The ILSP Language Identification System is a tool used for language identification in digital texts. The tool performs language identification for Greek, Greeklish, English, German, Dutch and French; it can also be used for other languages upon provision of specific supplementary external files for each new language. http://hdl.handle.net/11500/ATHENA-0000-0000-23E7-4 1.0.0 (automatically assigned) person@phs.uoa.gr Person Person_Surname Person_Name Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά (2015). Αναγνωριστής γλώσσας ψηφιακού κειμένου ΙΕΛ. Version 1.0.0 (automatically assigned). [Software (Tool/Service)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-23E7-4 Institute for Language and Speech Processing - Athena Research Center (2015). ILSP Language Identification System. Version 1.0.0 (automatically assigned). [Software (Tool/Service)]. CLARIN:EL. 
http://hdl.handle.net/11500/ATHENA-0000-0000-23E7-4 text Organization Ινστιτούτο Επεξεργασίας του Λόγου Institute for Language and Speech Processing http://www.ilsp.gr/ http://w3id.org/meta-share/omtd-share/LanguageTechnology http://w3id.org/meta-share/omtd-share/LanguageIdentification Αναγνώριση γλώσσας ηλεκτρονικού κειμένου Language identification in digital texts http://www.ilsp.gr/homepages/protopapas/pdf/Protopapas_2004_LangIDsubm.pdf ToolService http://w3id.org/meta-share/omtd-share/LanguageIdentification http://w3id.org/meta-share/meta-share/sourceCode http://www.hiddenLocation.org BSD 2-Clause "Simplified" License https://opensource.org/licenses/BSD-2-Clause http://w3id.org/meta-share/meta-share/unspecified http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/public BSD-2-Clause Αναγνωριστής γλώσσας ψηφιακού κειμένου ΙΕΛ. Δημιουργός: Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά. Άδεια: BSD 2-Clause "Simplified" License (https://opensource.org/licenses/BSD-2-Clause). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-23E7-4 (CLARIN:EL) ILSP Language Identification System by Institute for Language and Speech Processing - Athena Research Center used under BSD 2-Clause "Simplified" License (https://opensource.org/licenses/BSD-2-Clause). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-23E7-4 (CLARIN:EL) true http://w3id.org/meta-share/meta-share/corpus el-Latn el Latn Greeklish el-Grek el Grek fr fr en en de de nl nl http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/corpus el-Latn el Latn Greeklish el-Grek el Grek fr fr en en de de nl nl http://w3id.org/meta-share/meta-share/text false

.. _Voyant:

Combined tools
================

.. raw:: html

.. code-block:: xml 2019-04-02 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource Voyant Tools Τα Voyant Tools είναι ένα διαδικτυακό περιβάλλον ανοιχτού κώδικα που χρησιμοποιείται για την ανάγνωση και ανάλυση κειμένου ή/και σωμάτων κειμένων. Το περιβάλλον των Voyant Tools σχεδιάστηκε και υλοποιήθηκε από τους Stéfan Sinclair και Geoffrey Rockwell το 2016 με στόχο να διευκολύνει τους φοιτητές και ακαδημαϊκούς των ψηφιακών ανθρωπιστικών επιστημών αναφορικά με τις πρακτικές ανάγνωσης και ερμηνείας κειμενικού υλικού. Συγκεκριμένα, μπορείτε να χρησιμοποιήσετε τα Voyant Tools για: - να δείτε πώς επιτυγχάνεται η κειμενική ανάλυση με τη βοήθεια υπολογιστικών εργαλείων, - να μελετήσετε κείμενα που βρίσκετε στο διαδίκτυο ή κείμενα που έχετε επεξεργαστεί και τα οποία βρίσκονται αποθηκευμένα τοπικά στον Η/Υ σας, - να προσθέσετε επιπλέον λειτουργικότητες στις διαδικτυακές σας συλλογές, περιοδικά, ιστολόγια ή ιστότοπους, ώστε να μπορεί κανείς να δει το υλικό αυτό με τη βοήθεια εργαλείων ανάλυσης, - να προσθέσετε διαδραστικά στοιχεία στα κείμενα ή στα άρθρα που δημοσιεύετε στο διαδίκτυο ή ακόμα και να προσθέσετε διαδραστικά πάνελ στις ερευνητικές σας εκθέσεις (εφόσον αυτές μπορούν να δημοσιευθούν στο διαδίκτυο), ώστε οι αναγνώστες σας να μπορούν πολύ γρήγορα να βρουν τα βασικά σημεία και τα αποτελέσματα της έρευνάς σας, - να αναπτύξετε τα δικά σας εργαλεία χρησιμοποιώντας τις λειτουργικότητες και τον κώδικα των Voyant Tools. Voyant Tools is a web-based text reading and analysis environment. It is a scholarly project that is designed to facilitate reading and interpretive practices for digital humanities students and scholars as well as for the general public. What you can do with Voyant: --Use it to learn how computers-assisted analysis works. 
Check out our examples that show you how to do real academic tasks with Voyant. --Use it to study texts that you find on the web or texts that you have carefully edited and have on your computer. --Use it to add functionality to your online collections, journals, blogs or web sites so others can see through your texts with analytical tools. --Use it to add interactive evidence to your essays that you publish online. Add interactive panels right into your research essays (if they can be published online) so your readers can recapitulate your results. --Use it to develop your own tools using our functionality and code. http://hdl.handle.net/11500/ATHENA-0000-0000-5827-2 1.0.0 (automatically assigned) https://voyant-tools.org/ Person Person_Surname Person_Name Person Person_Surname Person_Name Rockwell, Geoffrey; Sinclair, Stéfan (2019). Voyant Tools. Version 1.0.0 (automatically assigned). [Software (Tool/Service)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-5827-2 text Person Person_Surname Person_Name Person Person_Surname Person_Name ToolService http://w3id.org/meta-share/omtd-share/LinguisticAnalysis http://w3id.org/meta-share/meta-share/webService https://voyant-tools.org/ http://w3id.org/meta-share/meta-share/unspecified GNU General Public License v3.0 or later https://www.gnu.org/licenses/gpl-3.0-standalone.html https://opensource.org/licenses/GPL-3.0 http://w3id.org/meta-share/meta-share/unspecified http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/public GPL-3.0-or-later Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode https://creativecommons.org/licenses/by/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/allowsProcessing http://w3id.org/meta-share/meta-share/public CC-BY-4.0 Voyant Tools. Δημιουργός: Geoffrey Rockwell and Stéfan Sinclair. 
Άδεια: GNU General Public License v3.0 or later (https://www.gnu.org/licenses/gpl-3.0-standalone.html, https://opensource.org/licenses/GPL-3.0) and Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-5827-2 (CLARIN:EL) Voyant Tools by Geoffrey Rockwell and Stéfan Sinclair used under GNU General Public License v3.0 or later (https://www.gnu.org/licenses/gpl-3.0-standalone.html, https://opensource.org/licenses/GPL-3.0) and Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode, https://creativecommons.org/licenses/by/4.0/). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-5827-2 (CLARIN:EL) false http://w3id.org/meta-share/meta-share/corpus http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/omtd-share/Pdf http://w3id.org/meta-share/omtd-share/Rtf http://w3id.org/meta-share/omtd-share/Xml http://w3id.org/meta-share/omtd-share/ConllU http://w3id.org/meta-share/omtd-share/Html http://w3id.org/meta-share/meta-share/corpus http://w3id.org/meta-share/meta-share/image http://w3id.org/meta-share/meta-share/corpus http://w3id.org/meta-share/meta-share/text false

.. _LDXML:

4. Language Descriptions
**************************

.. raw:: html

.. code-block:: xml 2015-09-10 2021-05-28 Person Person_Surname Person_Name http://w3id.org/meta-share/meta-share/CLARIN-SHARE Person Person_Surname Person_Name Αποθετήριο ΕΚ ΑΘΗΝΑ ATHENA RC Repository http://inventory.clarin.gr LanguageResource PANACEA σώμα ελληνικών n-γραμμάτων (n-grams) περιβαλλοντικού τομέα PANACEA Environment Corpus n-grams EL (Greek) Ο συγκεκριμένος πόρος περιλαμβάνει ελληνικά n-γράμματα (για n = 1 – 5) λέξεων και n-γράμματα λέξης/μορφοσυντακτικού χαρακτηρισμού/λήμματος στον τομέα του περιβάλλοντος, τα οποία συνοδεύονται από τις αντίστοιχες συχνότητες εμφάνισης στο σώμα κειμένων. Ο πόρος αναπτύχθηκε στο πλαίσιο του έργου PANACEA (http://www.panacea-lr.eu), που χρηματοδοτήθηκε από το 7ο ΠΠ. Βασίστηκε σε ιστοσελίδες που ανακτήθηκαν αυτόματα από το διαδίκτυο με χρήση του εργαλείου FPC, αφού αναγνωρίστηκε αυτόματα ότι τα κείμενα είναι γραμμένα στην ελληνική και είναι σχετικά με το περιβάλλον. Το κειμενικό σώμα αποτελείται από περίπου 31,71 εκατ. μονάδες. Η συλλογή των δεδομένων έγινε το καλοκαίρι του 2011. PANACEA Environment Corpus n-grams EL (Greek) 1.0 contains Greek word n-grams and Greek word/tag/lemma n-grams in the "Environment" (ENV) domain. N-grams are accompanied by their observed frequency counts. The length of the n-grams ranges from unigrams (single words) to five-grams. The data were collected in the context of PANACEA (http://www.panacea-lr.eu), an EU-FP7 Funded Project under Grant Agreement 248064. The n-gram counts were generated from crawled Web pages that were automatically detected to be in the Greek language and were automatically classified as relevant to the ENV domain. The collection consisted of approximately 31.71 million tokens. Data collection took place in the summer of 2011. 
http://hdl.handle.net/11500/ATHENA-0000-0000-23DA-3 1.0 http://nlp.ilsp.gr/panacea/D4.3/data/201209/gms/env_el/README.txt Person Person_Surname Person_Name Person Person_Surname Person_Name Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά (2015). PANACEA σώμα ελληνικών n-γραμμάτων (n-grams) περιβαλλοντικού τομέα. Version 1.0. [Model (n-gram model)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-23DA-3 Institute for Language and Speech Processing - Athena Research Center (2015). PANACEA Environment Corpus n-grams EL (Greek). Version 1.0. [Model (n-gram model)]. CLARIN:EL. http://hdl.handle.net/11500/ATHENA-0000-0000-23DA-3 monolingual environment Clarin_Domain002 Organization Ινστιτούτο Επεξεργασίας του Λόγου Institute for Language and Speech Processing http://www.ilsp.gr/ 2011-06-01 2011-08-31 Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language http://www.panacea-lr.eu http://panacea-lr.eu/ ICT-248064 http://w3id.org/meta-share/meta-share/euFunds Organization Ευρωπαϊκή Επιτροπή European Commission https://ec.europa.eu/info/index_en automatic web crawling, automatic language detection, data preprocessing (boilerpipe filtering, lemmatization & tagging) boilerpipe library https://code.google.com/archive/p/boilerpipe/ unspecified ILSP Lemmatizer unspecified ILSP Feature-based multi-tiered POS Tagger unspecified PANACEA Environment Corpus n-grams EL 1.0 README http://nlp.ilsp.gr/panacea/D4.3/data/201209/gms/env_el/README.txt LanguageDescription NGramModel http://w3id.org/meta-share/meta-share/word 5 LanguageDescriptionTextPart http://w3id.org/meta-share/meta-share/text http://w3id.org/meta-share/meta-share/monolingual el el http://w3id.org/meta-share/meta-share/downloadable http://www.hiddenLocation.org http://nlp.ilsp.gr/panacea/D4.3/data/201209/gms/env_el/ENV_EL_1000.3gms.sample http://nlp.ilsp.gr/panacea/D4.3/data/201209/gms/env_el/ENV_EL_wpl_1000.3gms.sample 14954020.0 
http://w3id.org/meta-share/meta-share/five-gram 13683940.0 http://w3id.org/meta-share/meta-share/four-gram 3860716.0 http://w3id.org/meta-share/meta-share/bigram 9767383.0 http://w3id.org/meta-share/meta-share/trigram 435189.0 http://w3id.org/meta-share/meta-share/unigram http://w3id.org/meta-share/omtd-share/Text Creative Commons Attribution Share Alike 4.0 International https://creativecommons.org/licenses/by-sa/4.0/legalcode https://creativecommons.org/licenses/by-sa/4.0/ http://w3id.org/meta-share/meta-share/attribution http://w3id.org/meta-share/meta-share/shareAlike http://w3id.org/meta-share/meta-share/allowsDirectAccess http://w3id.org/meta-share/meta-share/allowsProcessing http://w3id.org/meta-share/meta-share/public CC-BY-SA-4.0 PANACEA σώμα ελληνικών n-γραμμάτων (n-grams) περιβαλλοντικού τομέα. Δημιουργός: Ινστιτούτο Επεξεργασίας του Λόγου - Ερευνητικό Κέντρο Αθηνά. Άδεια: Creative Commons Attribution Share Alike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/legalcode, https://creativecommons.org/licenses/by-sa/4.0/). Πηγή: http://hdl.handle.net/11500/ATHENA-0000-0000-23DA-3 (CLARIN:EL) PANACEA Environment Corpus n-grams EL (Greek) by Institute for Language and Speech Processing - Athena Research Center used under Creative Commons Attribution Share Alike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/legalcode, https://creativecommons.org/licenses/by-sa/4.0/). Source: http://hdl.handle.net/11500/ATHENA-0000-0000-23DA-3 (CLARIN:EL) false false