Skip to content

Move numbering kinds to Codex #116

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 12 commits into
base: main
Choose a base branch
from
Draft

Conversation

MDLC01
Copy link
Collaborator

@MDLC01 MDLC01 commented Aug 7, 2025

Following typst/typst#6379 (comment), this PR adds numbering kinds to Codex (renamed to "numeral system" because this is the more appropriate name I believe). This is an initial implementation that may have multiple issues. The goal is to initiate the discussion on this topic.

The current implementation depends on ecow, which means the version used here may have to be somewhat synchronized with the one used in Typst. I'm not sure exactly how Cargo determines what version to use, so maybe this is a non-issue.

For now, I did not include the single-character representation of numeral systems as part of this. Should it be in Codex or only in Typst?

For the long names, I chose to use a symbol-like syntax for now (e.g., chinese.simplified, Chinese.simplified, chinese.traditional, and Chinese.traditional). This is of course up to debate, and one of the main open question for now. In particular, we might want to align ourselves with CSS Counter style names as mentioned on Discord 1

For now, I only included the formats that were already present in Typst.2

Finally, I realized this could be the occasion to broaden the crate description in Cargo.toml. On Discord, @laurmaedje mentioned that "we could expand codex from just naming symbols to more generally supporting Unicode- and internationalization-related efforts in the Typst ecosystem".1 Moreover, with Codex becoming much more than assigning symbols names, we might want to add a new Cargo feature for symbol names like we now have for styling and numeral systems.

Footnotes

  1. https://discord.com/channels/1054443721975922748/1277628305142452306/1380114491638943796 2

  2. https://github.com/typst/typst/blob/bcc71ddb9b5146b97cb3078ecdb5406f31781ea2/crates/typst-library/src/model/numbering.rs

@MDLC01
Copy link
Collaborator Author

MDLC01 commented Aug 8, 2025

Other question: some numeral systems simply do not have a way to represent some numbers (e.g., 0 in Roman numerals, or numbers greater than 50 in circled Arabic numerals). Should we change NumeralSystem::apply's return type to Option<EcoString> and let the user of Codex handle those cases?

@MDLC01
Copy link
Collaborator Author

MDLC01 commented Aug 8, 2025

This is clearly not something that is within the scope of this PR, but numeral systems could be merged with the concept of number formats, making apply accept an f64 and return some string when the numeral system is able to express the number.

@MDLC01 MDLC01 added the meta Discussion about the structure of this repo label Aug 8, 2025
Copy link
Collaborator

@T0mstone T0mstone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of depending on ecow, I think implementing Display on a custom type would be better. That way it can still be used in eco_format! to get an EcoString, but codex remains sort of independent of that and you could also use this functionality with regular strings.

I'm thinking this could be done by replacing the private helper functions f(spec, n) with two types pub struct FormatNumeral(NumeralSystemSpec, pub u64); pub enum NumeralSystemSpec { ...f(spec) } and changing NumeralSystem::apply to the signature (self, u64) -> FormatNumeral. Or alternatively, the struct could also be private and apply would return impl Display.

@MDLC01
Copy link
Collaborator Author

MDLC01 commented Aug 11, 2025

I removed the dependency on ecow by using Display as you suggested. I did so without a NumeralSystemSpec enum because this felt unnecessary, especially because we do not expose the formatting functions.

I'm working on preventing allocations in positional and bijective.

@T0mstone
Copy link
Collaborator

T0mstone commented Aug 11, 2025

I'm working on preventing allocations in positional and bijective.

std uses an array and writes into it in chunks starting from the end. This would need a bit of unsafe or a dependency on something like arrayvec, so maybe the vec is not so bad after all. (Edit: You don't even need MaybeUninit for this, just initialize an array of chars with '\0', so this can just be safe for a probably quite small performance cost.)
When keeping the Vec, I'd definitely use Vec::with_capacity(n.ilog2() / radix.ilog2() + 1) to make sure there are no reallocs.

@MDLC01
Copy link
Collaborator Author

MDLC01 commented Aug 11, 2025

I found a way to prevent allocation for non-bijective positional systems, but since I'm not at all used to reasoning with bijective positional systems, I'm not really sure how to to that for those, so I was also thinking of using with_capacity instead to at least make sure we limit the number of allocations to one.

@MDLC01
Copy link
Collaborator Author

MDLC01 commented Aug 11, 2025

The problem with using an array is that it has to be of a size known at compile time, meaning we have to assume the worst case, with is u64::MAX with a radix of 2. And this is clearly too much memory.

@T0mstone
Copy link
Collaborator

we have to assume the worse case

not necessarily, you could check against a pre-defined list of constant radixes.

@MDLC01
Copy link
Collaborator Author

MDLC01 commented Aug 11, 2025

I think allocating a vector with Vec::with_capacity is better. If we want to prevent allocations, let's do it using an algorithm that works all the time.

@MDLC01
Copy link
Collaborator Author

MDLC01 commented Aug 11, 2025

I did not add tests for every numeral system because I would like to go over them in separate PRs to make sure that they are all correctly implemented.

Co-authored-by: T0mstone <39707032+t0mstone@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta Discussion about the structure of this repo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants