Versions

The latest version of the Deep-Sequoia corpus is version 8.3, released in November 2018 (in parallel with UD version 2.3).

Version | Date       | UD version | Description                                                    | Grew-match
--------|------------|------------|----------------------------------------------------------------|-------------------
8.3     | 2018.11.19 | 2.3        | Fix some tokenisations                                         | Not available
8.2     | 2018.03.16 | 2.2        | Fix errors and improve annotation consistency                  | Not available
8.1     | 2017.10.21 | 2.1        | Fix errors and improve annotation consistency                  | surf and deep&surf
8.0     | 2017.03.13 | 2.0        | Fix errors, change encoding of fixed expressions (see below)   | surf and deep&surf
7.0     | 2015.11.13 |            | Fix errors by systematic search of inconsistency in annotation | surf and deep&surf
1.1     | 2014.06.05 |            | Fix some lemmas, fix 3 sentences with multiple surface roots   |
1.0     | 2014.05.29 |            | First release (with deep relations)                            |

Notes

Version numbers

In the 2015 release, the version number was set to 7.0 to align with the previous releases of the Sequoia corpus (before the introduction of deep dependencies).

Development version

The current development version is available on the master branch of the GitLab project. The same version is available in Grew-match: surf and deep&surf.

UD Versions

Since 2017, the Sequoia corpus (surface only) has also been available in Universal Dependencies format and is released as one of the UD corpora, under the name UD_French-Sequoia.

Encoding of fixed expressions

In version 7.0 and earlier, fixed expressions are encoded as a single token, with the _ symbol as a word separator. Since version 8.0, these expressions are represented by several tokens linked by dep_cpd relations. For instance, the two figures below show the sentence annodis.er_00106 with its annotation in Sequoia 7.0 and 8.0.

example 7.0

example 8.0
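
The change of encoding can also be sketched in code. The Python snippet below is only an illustration, not the conversion actually used to produce version 8.0: the Token class and the split_fixed_expression function are hypothetical names, the form "à_partir_de" is an illustrative fixed expression rather than one taken from annodis.er_00106, and attaching every component to the first one is an assumption about the direction of the dep_cpd relations. Lemmas, parts of speech and the renumbering of the rest of the sentence are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical minimal token: position, word form, governor, relation."""
    index: int
    form: str
    head: Optional[int] = None      # index of the governor, None if not set here
    relation: Optional[str] = None  # dependency label, None if not set here


def split_fixed_expression(form: str, start_index: int = 1) -> List[Token]:
    """Turn a 7.0-style token (components joined by "_") into 8.0-style tokens.

    Assumption: every component after the first is attached to the first
    component with a dep_cpd relation.
    """
    parts = form.split("_")
    tokens = [Token(start_index + i, part) for i, part in enumerate(parts)]
    for token in tokens[1:]:
        token.head = tokens[0].index
        token.relation = "dep_cpd"
    return tokens


# Usage: a single 7.0-style token occupying position 3 in its sentence.
for token in split_fixed_expression("à_partir_de", start_index=3):
    print(token.index, token.form, token.head, token.relation)
```

The first component keeps an empty head and relation, because its attachment to the rest of the sentence comes from the original single token and is outside the scope of this sketch.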