Search results

Schultze-Berndt

This corpus contains sound recordings and transcripts of two dialects of an Australian Aboriginal language, Jaminjung and Ngaliwurru. The materials were recorded and transcribed by Eva Schultze-Berndt between 1993 and 1998. This subcorpus contains transcripts, annotations, and recordings of Jaminjung data. This file giv…

Djamindjung Kriol Ngarinman

Landing page for this record at archive.mpi.nl

Documentation of the Trumai Language

The main purpose of this archive is the documentation of Trumai, a genetically isolated language spoken in Brazil (Xingu reserve). Trumai is an endangered language, with a reduced number of speakers. The archive has linguistic and non-linguistic materials, as well as some studies about the Trumai language and culture. T…

Landing page for this record at archive.mpi.nl

Savosavo and Gela

This archive has been created for the documentation of the Savosavo language together with the people of Savo Island and the Florida Islands, Central Province, Solomon Islands. The corpus, which is still under construction, contains data on two neighboring but unrelated languages, Savosavo and Gela. In addition, it pre…

Landing page for this record at archive.mpi.nl

DK-CLARIN Reference Corpus of General Danish

(Part of CLARIN-DK-UCPH Repository)

The DK-CLARIN Reference Corpus of General Danish was collected as part of the DK-CLARIN project, WP2.1, 2008-2011. All texts are in XML TEIP5 format (TEIP5DKCLARIN-format), with tokenisation, ePOS-tagging, sentence and paragraph segmentation, and lemmatisation. The corpus comprises 45,113,245 words.

Danish

Landing page for this record

DK-CLARIN LSP Corpus - Health domain

(Part of CLARIN-DK-UCPH Repository)

Texts in the Health and Medicine Domain come from netpatient.dk, Søfartsstyrelsen, Sundhedsstyrelsen, regionH, Libris, Aktuel Naturvidenskab and have been collected in the DK-CLARIN project, WP2.2, 2008 - 2011. The corpus consists of 3,972,573 words in 3273 files. Communicative setting/Number of files: expert->expe…

Danish

Landing page for this record

DK-CLARIN LSP Corpus - Agriculture domain

(Part of CLARIN-DK-UCPH Repository)

Texts in the Agriculture domain come from Danmarks JordbrugsForskning and have been collected in the DK-CLARIN project, WP2.2, 2008 - 2011. The corpus consists of 2,376,029 words in 216 files. Communicative setting/Number of files: expert->expert (45) expert->advanced (24) expert->basic (142) advanced->basic (5). …

Danish

Landing page for this record

DK-CLARIN LSP Corpus - Environment domain

(Part of CLARIN-DK-UCPH Repository)

Texts in the Environment Domain come from Hovedland, Danske Miljøundersøgelser, Det Økologiske Råd and Aktuel Naturvidenskab (via DMI). The corpus consists of 1,478,298 words in 93 files. Communicative setting/Number of files: expert->expert (2) expert->advanced (23) expert->basic (68). All texts are in XML TEIP5 fo…

Danish

Landing page for this record

DK-CLARIN LSP Corpus - IT domain

(Part of CLARIN-DK-UCPH Repository)

Texts in the IT Domain come from Libris, Open Office, Aktuel Naturvidenskab and have been collected in the DK-CLARIN project, WP2.2, 2008 - 2011. The corpus consists of 1,101,059 words in 66 files. Communicative setting/Number of files: expert->advanced (5) expert->basic (61). All texts are in XML TEIP5 format (TE…

Danish

Landing page for this record

DK-CLARIN Parallel Financial Corpus (da-en)

(Part of CLARIN-DK-UCPH Repository)

The DK-CLARIN Parallel Financial Corpus comprises 4.3 M Danish and 4.8 M English tokens from translated (parallel) documents, mainly annual reports, of the period 2002-2010 from 12 of the biggest Danish companies. All texts are in XML TEIP5 format (TEIP5DKCLARIN-format), with tokenisation, pos-tagging, sentence and pa…

Danish English

Landing page for this record

The SemDaX Corpus

(Part of CLARIN-DK-UCPH Repository)

The SemDaX Corpus is a Danish human-annotated corpus relying on the combined wordnet and dictionary resources DanNet and Den Danske Ordbog, and is available through a CLARIN academic license. The corpus includes approx. 90,000 words, comprises six textual domains, and is annotated with sense inventories of different gran…

Danish

Landing page for this record

CLARIN Virtual Language Observatory
