Update process of the Sequoia resources
Several formats are available; they are all built from a single resource that contains all layers: sequoia.deep_and_surf.parseme.frsemcor.
The figure below describes the different transformations that are applied to produce other formats.
All files are available from the Gitlab project.
Formats depend on:
- Presence of PARSEME-FR and FRSEMCOR annotations:
  - with such annotations: suffix parseme.frsemcor and orange background
  - without: suffix conll (or conllu) and green background
- Deep syntactic annotation:
  - with both surface and deep dependencies: basename is sequoia.deep_and_surf
  - with only surface dependencies: basename is sequoia.surf
- Conversion to Universal dependencies:
  - Native dependencies: basename is sequoia
  - Universal dependencies conversion: basename is sequoia-ud
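The naming conventions above can be read mechanically from a file name. The following is a minimal sketch, not part of the Sequoia tooling: the function name describe_sequoia_file and the returned keys are hypothetical, and apart from sequoia.deep_and_surf.parseme.frsemcor the file names used in the usage lines are illustrative combinations of the basenames and suffixes listed above.

```python
def describe_sequoia_file(filename):
    """Decode a Sequoia file name following the naming conventions above.

    Hypothetical helper for illustration; not part of the Sequoia project.
    """
    # The suffix tells whether PARSEME-FR and FRSEMCOR annotations are present.
    # ".conllu" is tested before ".conll" only for readability: a name ending
    # in ".conllu" does not end in ".conll", so the order is not critical.
    for suffix, semantic in ((".parseme.frsemcor", True),
                             (".conllu", False),
                             (".conll", False)):
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            break
    else:
        raise ValueError(f"unknown suffix in {filename!r}")

    if not stem.startswith("sequoia"):
        raise ValueError(f"unknown basename in {filename!r}")

    # Native annotation scheme versus Universal Dependencies conversion.
    dependencies = "ud" if stem.startswith("sequoia-ud") else "native"

    # Which dependency layers the basename announces (None if it says nothing).
    if stem == "sequoia.deep_and_surf":
        layers = "deep_and_surf"
    elif stem == "sequoia.surf":
        layers = "surf"
    else:
        layers = None

    return {
        "semantic_annotations": semantic,
        "dependencies": dependencies,
        "layers": layers,
    }


full = describe_sequoia_file("sequoia.deep_and_surf.parseme.frsemcor")
surface = describe_sequoia_file("sequoia.surf.conll")   # illustrative name
ud = describe_sequoia_file("sequoia-ud.conllu")         # illustrative name
```

For the single source resource, the sketch reports semantic annotations present, native dependencies, and both surface and deep layers.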
The Graph Rewriting Systems (usable with Grew) are available:
- sequoia_proj.grs: in the Gitlab project (folder tools)
- ssq_to_ud/main.grs: in the Gitlab project (folder grs)
Note about universal dependencies
In order to produce the final files for the UD project, two more steps are needed (not detailed here):
- Adding the misc feature SpaceAfter=No when needed (using the udapi tool)
- Splitting into three subfiles: dev, test and train
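To illustrate what the SpaceAfter=No step computes, here is a minimal standalone sketch. It is not the udapi-based procedure used for the release. It assumes that the raw sentence text is available and that every token form appears verbatim and in order in that text, so multiword tokens such as French contractions are not handled. The function name space_after_no is hypothetical.

```python
def space_after_no(text, forms):
    """Return one boolean per token: True if the token is immediately
    followed by a non-whitespace character in `text`, which is the case
    where the feature SpaceAfter=No would apply.

    Illustrative only; assumes each form occurs verbatim, in order, in `text`.
    """
    flags = []
    pos = 0
    for form in forms:
        start = text.find(form, pos)
        if start < 0:
            raise ValueError(f"token {form!r} not found in text")
        pos = start + len(form)
        # The last token of the text is left unflagged: deciding it would
        # require looking at the following sentence.
        flags.append(pos < len(text) and not text[pos].isspace())
    return flags


flags = space_after_no("L'avion décolle.", ["L'", "avion", "décolle", "."])
# "L'" and "décolle" are directly followed by another token, so only those
# two would receive SpaceAfter=No in the misc column.
```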