Versions

Releases

Version  Date        UD-version  Description
9.2      2020.10.05  [2.7]       A few corrections of lemmas and FRSEMCOR annotations
9.1      2020.06.10  2.6         FRSEMCOR annotation
9.0      2019.05.17  2.4         MWE & NE annotations
8.3      2018.11.19  2.3         Fix some tokenisations
8.2      2018.03.16  2.2         Fix errors and improve annotation consistency
8.1      2017.10.21  2.1         Fix errors and improve annotation consistency
8.0      2017.03.13  2.0         Fix errors, change encoding of fixed expressions (see below)
7.0      2015.11.13  -           Fix errors through a systematic search for annotation inconsistencies
1.1      2014.06.05  -           Fix some lemmas, fix 3 sentences with multiple surface roots
1.0      2014.05.29  -           First release with deep relations (aligned with Sequoia 6.0)

Development version

The current version is available as the master branch of the GitLab project, and the 6 versions of the corpus described on the process page are available on Grew_match.

Notes

Earlier versions

Earlier versions of the Sequoia corpus (before deep syntax annotation) are described here and are available for download here.

Version numbers

In the 2015 release, the version number was set to 7.0 to align with previous releases of the Sequoia corpus (before the introduction of deep dependencies).

UD Versions

Since 2017, the Sequoia corpus (surface only) has also been available in Universal Dependencies format and is released as one of the UD corpora, named UD_French-Sequoia.

Encoding of fixed expressions

In version 7.0 and earlier, fixed expressions are encoded as a single token, with the _ symbol as a word separator. Since version 8.0, these expressions are represented by several tokens linked with dep_cpd relations. For instance, in the two figures below, the sentence annodis.er_00106 is given with its annotation in Sequoia 7.0 and 8.0.

example 7.0

example 8.0
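
The change of encoding can be illustrated with a minimal Python sketch. It is not the conversion actually used for the corpus: the `Token` class and the `split_fixed_expression` function are hypothetical names, the expression "afin_de" is an illustrative input rather than a token taken from annodis.er_00106, and attaching every later component to the first one is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    index: int   # 1-based position in the sentence
    form: str    # surface form
    head: int    # index of the governor (0 for the root)
    deprel: str  # dependency label


def split_fixed_expression(form: str, head: int, deprel: str, start: int = 1) -> List[Token]:
    """Turn a 7.0-style token ("afin_de") into 8.0-style tokens.

    The first component keeps the original head and label. Each later
    component is attached to the first one with a dep_cpd relation
    (assumption: the first component acts as the head of the expression).
    """
    # Ignore empty pieces so that a bare "_" is not split into nothing.
    parts = [p for p in form.split("_") if p] or [form]
    first = Token(start, parts[0], head, deprel)
    rest = [
        Token(start + offset, part, start, "dep_cpd")
        for offset, part in enumerate(parts[1:], start=1)
    ]
    return [first] + rest


if __name__ == "__main__":
    # "dep" is a placeholder label for the original relation.
    for token in split_fixed_expression("afin_de", head=5, deprel="dep", start=3):
        print(token)
```

A real conversion would also have to renumber the tokens that follow the expression, update the heads pointing at them, and assign a part of speech to each component. The sketch leaves these steps out.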