Each Unicode character is assigned a general category. The table below is the complete list of category codes.

| Code | Description |
|---|---|
| [Cc] | Other, Control |
| [Cf] | Other, Format |
| [Cn] | Other, Not Assigned (no characters in the Unicode data file have this property) |
| [Co] | Other, Private Use |
| [Cs] | Other, Surrogate |
| [LC] | Letter, Cased (grouping of Lu, Ll, and Lt) |
| [Ll] | Letter, Lowercase |
| [Lm] | Letter, Modifier |
| [Lo] | Letter, Other |
| [Lt] | Letter, Titlecase |
| [Lu] | Letter, Uppercase |
| [Mc] | Mark, Spacing Combining |
| [Me] | Mark, Enclosing |
| [Mn] | Mark, Nonspacing |
| [Nd] | Number, Decimal Digit |
| [Nl] | Number, Letter |
| [No] | Number, Other |
| [Pc] | Punctuation, Connector |
| [Pd] | Punctuation, Dash |
| [Pe] | Punctuation, Close |
| [Pf] | Punctuation, Final quote (may behave like Ps or Pe depending on usage) |
| [Pi] | Punctuation, Initial quote (may behave like Ps or Pe depending on usage) |
| [Po] | Punctuation, Other |
| [Ps] | Punctuation, Open |
| [Sc] | Symbol, Currency |
| [Sk] | Symbol, Modifier |
| [Sm] | Symbol, Math |
| [So] | Symbol, Other |
| [Zl] | Separator, Line |
| [Zp] | Separator, Paragraph |
| [Zs] | Separator, Space |
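
To see these codes in practice, the following is a minimal sketch that looks up a character's category with Python's standard `unicodedata` module. The helper names `general_category` and `major_class` are illustrative, not part of any library, and the results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def general_category(ch: str) -> str:
    """Return the two-letter general category code for a single character."""
    return unicodedata.category(ch)


def major_class(ch: str) -> str:
    """Return the first letter of the code: L, M, N, P, S, Z, or C."""
    return unicodedata.category(ch)[0]


if __name__ == "__main__":
    # A few sample characters, including a combining mark and a control character.
    for ch in ["A", "a", "5", " ", "$", "(", "_", "\u0301", "\n"]:
        print(f"U+{ord(ch):04X}  {general_category(ch)}  {major_class(ch)}")
```

A lookup like this returns one of the two-letter codes from the table. LC is a grouping of Lu, Ll, and Lt, so it is not returned for an individual character.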