From: Comparison of named entity recognition methodologies in biomedical documents
| Feature | Description |
|---|---|
| Unigram | \( w_{i-2},\ w_{i-1},\ w_{i},\ w_{i+1},\ w_{i+2} \) |
| Bigram | \( w_{i-2} \mid w_{i-1},\ w_{i-1} \mid w_{i},\ w_{i} \mid w_{i+1},\ w_{i+1} \mid w_{i+2} \) |
| Trigram | \( w_{i-2} \mid w_{i-1} \mid w_{i},\ w_{i-1} \mid w_{i} \mid w_{i+1},\ w_{i} \mid w_{i+1} \mid w_{i+2} \) |
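
The table's window features can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: the function name `window_features`, the feature-key format, the `|` joiner, and the `<PAD>` token used for positions that fall outside the sentence are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical padding token for window positions outside the sentence
# (the source table does not say how boundaries are handled).
PAD = "<PAD>"


def window_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Build unigram, bigram, and trigram features in a +/-2 window around tokens[i]."""

    def w(offset: int) -> str:
        # Token at position i + offset, or PAD if that position is out of range.
        j = i + offset
        return tokens[j] if 0 <= j < len(tokens) else PAD

    feats: Dict[str, str] = {}

    # Unigrams: w[i-2], w[i-1], w[i], w[i+1], w[i+2]
    for k in range(-2, 3):
        feats[f"w[{k}]"] = w(k)

    # Bigrams: the four adjacent pairs inside the window
    for k in range(-2, 2):
        feats[f"w[{k}]|w[{k + 1}]"] = "|".join((w(k), w(k + 1)))

    # Trigrams: the three adjacent triples inside the window
    for k in range(-2, 1):
        feats[f"w[{k}]|w[{k + 1}]|w[{k + 2}]"] = "|".join((w(k), w(k + 1), w(k + 2)))

    return feats


if __name__ == "__main__":
    sentence = ["the", "p53", "gene", "is", "mutated"]
    for name, value in window_features(sentence, 2).items():
        print(f"{name} = {value}")
```

Each token position yields 5 unigram, 4 bigram, and 3 trigram features, matching the three rows of the table. A dictionary of named string features like this is a common input shape for sequence-labelling toolkits, though the exact encoding a given toolkit expects may differ.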