Development version

The current version is available as the master branch of the GitLab project. The corresponding corpora are available in Grew-match: surf, deep_and_surf, surf.parseme, deep_and_surf.parseme.


Version | Date | UD version | Description | Grew-match
9.0 | 2019.05.17 | 2.4 | MWE & NE annotations | surf, deep_and_surf, surf.parseme, deep_and_surf.parseme
8.3 | 2018.11.19 | 2.3 | Fix some tokenisations | surf, deep_and_surf
8.2 | 2018.03.16 | 2.2 | Fix errors and improve annotation consistency | surf, deep_and_surf
8.1 | 2017.10.21 | 2.1 | Fix errors and improve annotation consistency | surf, deep_and_surf
8.0 | 2017.03.13 | 2.0 | Fix errors, change encoding of fixed expressions (see below) | surf, deep_and_surf
7.0 | 2015.11.13 | | Fix errors by systematic search for annotation inconsistencies | surf, deep_and_surf
1.1 | 2014.06.05 | | Fix some lemmas, fix 3 sentences with multiple surface roots |
1.0 | 2014.05.29 | | First release with deep relations (aligned with Sequoia 6.0) |


Earlier versions

Earlier versions of the Sequoia corpus (before deep syntax annotation) are described here and are available for download here.

Version numbers

In the 2015 release, the version number was set to 7.0 to align with the previous releases of the Sequoia corpus (before the introduction of deep dependencies).

UD Versions

Since 2017, the Sequoia corpus (surface only) has also been available in Universal Dependencies format and is released as one of the UD corpora, under the name UD_French-Sequoia.

Encoding of fixed expressions

In version 7.0 and earlier, fixed expressions are encoded as a single token, with the _ symbol as a word separator. Since version 8.0, these expressions are represented by several tokens linked with dep_cpd relations. For instance, in the two figures below, the sentence annodis.er_00106 is given with its annotation in Sequoia 7.0 and 8.0.

example 7.0

example 8.0
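
The change of encoding can be illustrated with a minimal Python sketch. It is not the conversion script used for the corpus: the function name split_fixed_expression is hypothetical, the expression à_cause_de is only an illustrative input (it is not taken from annodis.er_00106), and attaching every later component to the first one is an assumption made for the example.

```python
def split_fixed_expression(form, sep="_"):
    """Split a 7.0-style single-token fixed expression into 8.0-style tokens.

    Returns a list of (position, form, head, relation) tuples. Positions are
    1-based within the expression. The first component gets head 0 and no
    relation here, meaning it keeps whatever attachment the original single
    token had. ASSUMPTION: all other components depend on the first one.
    """
    parts = [p for p in form.split(sep) if p]
    if len(parts) < 2:
        # Not a multi-word expression (this also covers a literal "_" token).
        return [(1, form, 0, None)]
    tokens = [(1, parts[0], 0, None)]
    for position, part in enumerate(parts[1:], start=2):
        tokens.append((position, part, 1, "dep_cpd"))
    return tokens


# Illustrative input only, not an excerpt from the corpus.
for token in split_fixed_expression("à_cause_de"):
    print(token)
```

A token without the separator is returned unchanged, so the same function can be applied to every token of a sentence.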