Search results

Schultze-Berndt

This corpus contains sound recordings and transcripts of two dialects of an Australian Aboriginal language, Jaminjung and Ngaliwurru. The materials were recorded and transcribed by Eva Schultze-Berndt between 1993 and 1998. This subcorpus contains transcripts, annotations, and recordings of Jaminjung data. This file giv…

Djamindjung, Kriol, Ngarinman

Landing page for this record at archive.mpi.nl

Documentation of the Trumai Language

The main purpose of this archive is the documentation of Trumai, a genetically isolated language spoken in Brazil (Xingu reserve). Trumai is an endangered language, with a reduced number of speakers. The archive has linguistic and non-linguistic materials, as well as some studies about the Trumai language and culture. T…

Landing page for this record at archive.mpi.nl

Savosavo and Gela

This archive has been created for the documentation of the Savosavo language together with the people of Savo Island and the Florida Islands, Central Province, Solomon Islands. The corpus, which is still under construction, contains data on two neighboring but unrelated languages, Savosavo and Gela. In addition, it pre…

Landing page for this record at archive.mpi.nl

The LIA Treebank

(Part of Clarino - Textlab)

The LIA Treebank includes 7536 speech segments and 77 701 tokens from LIA Norwegian. The treebank is annotated with morphological and dependency-style syntactic analysis and manually corrected. The treebank is available in three versions: a downloadable version in CoNLL-X format, a searchable version in the search inter…

Norwegian Norwegian Ny..

Landing page for this record at tekstlab.uio.no

The SemDaX Corpus

(Part of CLARIN-DK-UCPH Repository)

The SemDaX Corpus is a Danish human-annotated corpus that relies on the combined wordnet and dictionary resources DanNet and Den Danske Ordbog and is available through a CLARIN academic license. The corpus includes approx. 90,000 words, comprises six textual domains, and is annotated with sense inventories of different gran…

Danish

Landing page for this record

DK-CLARIN LSP Corpus - IT domain

(Part of CLARIN-DK-UCPH Repository)

Texts in the IT Domain come from Libris, Open Office, Aktuel Naturvidenskab and have been collected in the DK-CLARIN project, WP2.2, 2008 - 2011. The corpus consists of 1,101,059 words in 66 files. Communicative setting/Number of files: expert->advanced (5) expert->basic (61). All texts are in XML TEIP5 format (TE…

Danish

Landing page for this record

DK-CLARIN Rapid Parallel Corpus 1993-2003 (da-en-de)

(Part of CLARIN-DK-UCPH Repository)

The corpus consists of press releases from the European Commission Press Release Database (Rapid) harvested in 2009 (http://europa.eu/rapid/search.htm). Each of the 5330 press releases (files) exists in Danish, English and German, with approx. 3,000,000 words for each language. All texts are in XML TEIP5 format (TEIP5DKCLA…

Danish, English, German

Landing page for this record

DK-CLARIN LSP Corpus - Health domain

(Part of CLARIN-DK-UCPH Repository)

Texts in the Health and Medicine Domain come from netpatient.dk, Søfartsstyrelsen, Sundhedsstyrelsen, regionH, Libris, Aktuel Naturvidenskab and have been collected in the DK-CLARIN project, WP2.2, 2008 - 2011. The corpus consists of 3,972,573 words in 3273 files. Communicative setting/Number of files: expert->expe…

Danish

Landing page for this record

DK-CLARIN LSP Corpus - Nanotechnology domain

(Part of CLARIN-DK-UCPH Repository)

Texts in the Nanotechnology domain come from iNano (Interdisciplinary Nanoscience Center, AU), Nano (DTU), Niels Bohr Institutet, Forskningscenter Risø, Ministeriet for Sundhed og Forebyggelse (via DTU), Miljøstyrelsen, Aktuel Naturvidenskab and have been collected in the DK-CLARIN project, WP2.2, 2008 - 2011. The co…

Danish

Landing page for this record

DK-CLARIN LSP Corpus - Agriculture domain

(Part of CLARIN-DK-UCPH Repository)

Texts in the Agriculture domain come from Danmarks JordbrugsForskning and have been collected in the DK-CLARIN project, WP2.2, 2008 - 2011. The corpus consists of 2,376,029 words in 216 files. Communicative setting/Number of files: expert->expert (45) expert->advanced (24) expert->basic (142) advanced->basic (5). …

Danish

Landing page for this record

CLARIN Virtual Language Observatory
