Skip to content

feat: add support for controlling bundle size #335

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

techfg
Copy link

@techfg techfg commented May 1, 2025

As discussed in #326, other than for astro-expressive-code, there is no current way to control the bundle size when using expressive-code outside of some package.json related workarounds. This PR aims to address that by providing stock support for controlling bundle size by introducing a new Core concept across the various packages (e.g., ExpressiveCodeCore, RehypeExpressiveCodeCore, etc.).

Note

The approach taken in this PR is to ensure 100% backwards compatibility with v0.41.2. All existing tests pass without any modification. If it's acceptable to introduce some minor adjustments, the current approach can be simplified in some areas. Since expressive-code uses semver 0.x.x, technically every release is a breaking change so for the purposes of long-term maintenance, some changes may be worth considering.

TLDR;

0 lang / 0 theme Full Bundle Shiki PluginShiki PluginFrames PluginTextMarkers
v0.41.2 9,575,587 9,693,617 19,415 23,186 22,246
Core No Plugins 338,581 0 0 0 0
Savings 9,237,006 9,693,617 19,415 23,186 22,246
1 lang / 1 theme Full Bundle Shiki PluginShiki PluginFrames PluginTextMarkers
v0.41.2 9,575,587 9,693,617 19,415 23,186 22,246
Core Bundle 772,274 358,594 20,194 0 0
Savings 8,803,313 9,335,023 -779 23,186 22,246
1 lang / 1 theme Full Bundle Shiki PluginShiki PluginFrames PluginTextMarkers
v0.41.2 9,575,587 9,693,617 19,415 23,186 22,246
Core Highlighter 773,154 360,766 18,949 0 0
Savings 8,802,433 9,332,851 466 23,186 22,246
  1. Setting shiki=false in configuration makes no difference to final bundle size when using ExpressiveCode/rehypeExpressiveCode/createRenderer. To reduce bundle size, you must use Core directly.
  2. Shiki & Plugin* columns are renderedLength of corresponding item, the Full Bundle column is the actual final size of the module on disk
  3. The reason PluginFrames and PluginTextMarkers are not zero in Core is primarily due to styles (css) which could likely be refactored out

Additional Information

  1. remark-expressive-code appears to be deprecated so it was not updated for Core
  2. astro-expressive-code needs to be updated to Core. Once a decision is made on potentially incorporating this PR and the Core API finalized, I'll look at updating it to use Core rather than its vite-plugin approach.
  3. I've added a few comments to code in this PR to highlight some things that should be specifically evaluated/considered outside of the general PR itself
  4. The bundle-size tests generate builds across the different variants of RehypeExpressiveCode and RehypeExpressiveCodeCore. In dist folder of each fixture are statistics files that can be reviewed for detailed information about the build output. See visualizer and bundle-stats for more information.

Open to any and all feedback, appreciate your consideration!

Resolves #326

Copy link

netlify bot commented May 1, 2025

Deploy Preview for expressive-code ready!

Name Link
🔨 Latest commit 5ed9bd1
🔍 Latest deploy log https://app.netlify.com/projects/expressive-code/deploys/6869e92263222900084ac720
😎 Deploy Preview https://deploy-preview-335--expressive-code.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@@ -25,7 +15,9 @@ export interface PluginShikiOptions {
*
* @example { 'mjs': 'javascript' }
*/
langAlias?: Record<string, string> | undefined
// TODO: This should be simply L but for backwards compat, need to support string. This should be changed so that you cannot provide
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is a bug in the current released version which uses Record<string, string>, however the languages (the value of the key) should be constrained to the languages available in the current bundle. Currently, you could write an alias of foo->bar and would not get any typescript warning (it would fail at runtime). Additionally, with string, string, you do not get any intellisense/autocomplete. I've adjusted to use StringLiteralUnion which will solve the intellisense problem but I believe this should just be L in order to provide proper type safety. I made this backwards compatible (will support any string) per the goals of the first draft of this PR but my recommendation is that this should be changed to L. It wouldn't be considered a breaking change, I don't believe, because the current version will fail at runtime anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intentionally not limited to the bundled languages as users can add their own languages through the langs option, and it should be possible to define aliases for such additional languages as well. But using StringLiteralUnion is a great idea.

Copy link
Author

@techfg techfg May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what you did here now, nice :) Sorry, overlooked the way this was handled previously.

Ok, I just committed 58ec5b5 which makes the following adjustments:

  1. Removes the TODO
  2. Adds a note in the langAlias doc regarding name property to make it clear how it must align (name is required on LanguageRegistration but this should help make it clear how langAlias matches up)
  3. Aligned the docs for PluginShikiOptions and PluginShikiBundleOptions regarding langs
  4. Removed langAlias from PluginShikiHighlighterOptions since it never really made sense to support it in the first place since the user is under full control of the highlighter and can specify langAlias directly. If we did choose to support langAlias would could only support L (not StringLiteralUnion<L>) since we don't have a langs property to iterate and load from nor do we "init" (and cache) anything like we do with pluginShiki/plugShikiBundle.
  5. Added tests regarding custom languages and aliases since I didn't see any coverage for this scenario in the existing tests.

async function createRegexEngine(engine: PluginShikiOptions['engine']) {
// The [...engine...][0] syntax makes it easier to find this code in the built package,
// allowing astro-expressive-code to remove unused engines from the SSR bundle
// TODO: This could be adjusted to use direct imports if/when astro-expressive-code is updated
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If astro-expressive-code is updated to use Core, these can become static imports vs. dynamic

baseStyles: string
themeStyles: string
jsModules: string[]
}

export type RehypeExpressiveCodeCoreRenderer = RehypeExpressiveCodeEngineRenderer<ExpressiveCodeCore>
export type RehypeExpressiveCodeRenderer = RehypeExpressiveCodeEngineRenderer<ExpressiveCode>
Copy link
Author

@techfg techfg May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These wrapper types only exist for backwards compat. Ultimately, ec in the base type is the instance of ExpressiveCode/ExpressiveCodeCore that was created but from a typing perspective, I believe it makes more sense to just have ec: ExpressiveCodeEngine vs the actual instance types. If I'm understanding the code, ExpressiveCode class is just a tiny layer on top of the engine. If ExpressiveCode/ExpressiveCodeCore classes are thought of as a bootstrap for the engine, they shouldn't ever have any specific API so exposing the engine type directly would simplify the typings here and in a couple of other places.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further thoughts on this at #335 (comment)


type ShikiDetails = Awaited<ReturnType<typeof buildFixture>>

describe('integration with shiki affect on ec bundle size', () => {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test suite adds ~25seconds to pnpm test. I think there is a lot of value in having all these tests to inspect/review various configurations but for normal CI runs, possibly narrow this down to a subset of these across the 4 variation groups. Each individual test takes ~1.5sec.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's awesome that you've added these tests and agree that there is a lot of value in having them.

In case we need to limit the tests performed during normal runs to prevent slowing down development too much, which subset would you recommend to always run?

Copy link
Author

@techfg techfg May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more that I've thought about this, I think for CI, leaving all the tests makes sense - at least for the main branch workflow, possibly pr branches can have a subset.

Where this additional ~25secs being a challenge is in local development, possibly that is what you meant by "slowing down development"? I think this question really comes down to what the normal workflow is for a developer when testing EC. If the approach is using pnpm test from the root, then this additional time isn't ideal.

I think at minimum, the tests with no-themes.ts would be a good baseline across all 4 test categories within the suite. This makes sure that a vanilla configuration with no additional themes specified works as expected re: bundle size for ec, ec-core, ec-core-bundle and ec-core-highlighter. This could be easily accomplished by adding a parameter to TestCase and then conditionally using test.skip within the test based on its value.

@hippotastic
Copy link
Collaborator

This is seriously impressive work! Thank you for diving so deep into this, finding clever solutions, documenting your decisions, adding tests - just wow. I would have never expected this amount of effort being put into limiting the bundle size.

I've just returned from holidays and will have a look at this in more detail soon. Based on how substantial these changes are, going through them could take a while, but I'll make sure to make room for it in my schedule. Thanks again!

@techfg
Copy link
Author

techfg commented May 5, 2025

This is seriously impressive work! Thank you for diving so deep into this, finding clever solutions, documenting your decisions, adding tests - just wow. I would have never expected this amount of effort being put into limiting the bundle size.

I've just returned from holidays and will have a look at this in more detail soon. Based on how substantial these changes are, going through them could take a while, but I'll make sure to make room for it in my schedule. Thanks again!

Welcome back @hippotastic, hope you enjoyed the time away!

Appreciate your kind words, looking forward to hearing your further thoughts once you've had a chance to fully digest. In meantime, I'll start reviewing your feedback thus far and comment back and/or update PR accordingly.

@techfg
Copy link
Author

techfg commented May 5, 2025

@hippotastic -

I was able to figure out why plugin-frames and plugin-text-markers had a small footprint with the initial proposed Core layer introduced in this PR. In short, both of these plugins, along with expressive-code itself have side-effects related to the EC styling system. For example, pluginFrameTexts prohibits the bundler from tree-shaking even when plugin-frames is not used since the import { pluginFrames } from '@expressive-code/plugin-frames is in the entry point module itself. Something to consider to eliminate this would be to change the way the styling system works, possibly a registration API vs. the current approach.

Given the goal of this PR is to be 100% backwards compatible, in order to eliminate plugin-frames and plugin-text-markers "leaking" in to Core, I had to separate the Core related entry points into their own module. Previously, all the Core related items were in the existing index entry point across @expressive-code/plugin-shiki, expressive-code and rehype-expressive-code. I just added ef112ae which is 100% functionality equivalent to the prior version of this PR but makes the following adjustments:

  1. Moves Core related entries to @expressive-code/plugin-shiki/core, expressive-code/core & rehype-expressive-code/core
  2. Updates the bundle-size.test.ts to include tests around plugin-frames and plugin-text-markers for all 4 categories of tests within the suite

A few things:

  1. I've added a few more comments to this PR for you to review based on things I found while working on ef112ae
  2. The 'net' savings is only ~7k but it just didn't seem right to me to have plugin-frames & plugin-text-markers in the bundle when they were not being used at all. This commit closes that gap. I also think that having the core entry points is more consistent with the Core concept that this PR is trying to introduce.
  3. I've updated the description of the PR to reflect the final numbers and other information with this commit now included but here's a snapshot of how the numbers changed with ef112ae - note that the actual v0.41.2 numbers themselves have changed from the original OP as I backported the tests to v0.41.2 to get a 100% fully accurate comparison (previous v0.41.2 numbers were drawn from this PR vs. what's actually in main):
0 lang / 0 theme Full Bundle Shiki PluginShiki PluginFrames PluginTextMarkers
v0.41.2 9,577,922 9,575,587 9,695,299 9,693,617 20,365 19,415 23,186 22,246
Core No Plugins 347,138 338,581 0 0 3,788 0 2,367 0
Savings 9,230,784 9,237,006 9,695,299 9,693,617 20,365 19,415 19,398 23,186 19,879 22,246

import type { PluginFramesOptions } from '@expressive-code/plugin-frames'
import { pluginFrames } from '@expressive-code/plugin-frames'
import type { PluginShikiOptions } from '@expressive-code/plugin-shiki'
import { pluginShiki } from '@expressive-code/plugin-shiki'
import { pluginTextMarkers } from '@expressive-code/plugin-text-markers'

// NOTE - tsup code splitting is disabled to avoid re-exports being dropped
// see https://github.com/evanw/esbuild/issues/1737 & https://github.com/evanw/esbuild/issues/1521
// TODO: Consider a different bundler in order to support code-splitting to reduce /dist size (e.g., unbuild)
Copy link
Author

@techfg techfg May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to disable splitting in expressive-code due to limitations with tsup/esbuild and re-exports. I noticed that you also have it disabled in @expressive-code/core, likely for the same reason.

This isn't ideal but also not the end of the world since the total size of dist doesn't increase all that much but possibly it may be worth considering a bundler that has better support for re-exports such as unbuild which would, more or less, be a drop-in replacement for tsup.

@@ -15,14 +15,24 @@
"target": "ESNext",
"paths": {
"@expressive-code/core": ["./packages/@expressive-code/core/src"],
"@expressive-code/core/hast": ["./packages/@expressive-code/core/src/hast"],
Copy link
Author

@techfg techfg May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After adding core entry points to the three (3) packages, I started getting typescript warnings/errors due to unresolved packages. I only needed to add entries here for the new core modules but noticed that several others were missing so added those in as well. If their absence was intentional, let me know and I can remove all but the new core ones.

Also, along a similar line and while completely unrelated to this PR, I noticed that this line has a typescript error File '/home/barry/repos/expressive-code/packages/rehype-expressive-code/test/utils.ts' is not under 'rootDir' '/home/barry/repos/expressive-code/packages/astro-expressive-code'. 'rootDir' is expected to contain all source files.ts(6059). There are other examples of this error in astro-expressive-code. Is there a reason you explicitly set rootDir and not in tsconfig.base.json?

const jsModules = await ec.getJsModules()

return {
ec,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As alluded to in #335 (comment), the more I've thought about this, I think that ec here should be typed ExpressiveCodeEngine. If we make this change, then we do not need the ctor option that gets passed into this function as we can always use ExpressiveCodeCore to create ec even when we're coming from an ExpressiveCode path. If we make that change, then we could also eliminate the loadTheme option and solely rely on customLoadTheme. To do that, in createRenderer, we would adjust to set customLoadTheme: options.customLoadTheme ?? loadShikiTheme.

If we make both of those changes, we can significantly reduce the complexity of the types and "lift" everything in common.ts directly in to core.ts.

The only downsides to make these two changes are:

  1. It would not be 100% backwards compatible with v0.41.2. However, other than the typings themselves, at runtime, the only place I could see this causing any issues is if someone was doing ec instanceof ExpressiveCode, a situation that is highly unlikely to be done by userland.
  2. We would be changing the actual options passed in by the user if they didn't provide a customLoadTheme so any downstream usage of options would now have a customLoadTheme when it originally didn't. I do not think this is a problem though since if they are heading down the createRenderer path anyway, the only downstream usage of options is going to be stock functionality anyway.

Copy link
Author

@techfg techfg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hippotastic -

Found another opportunity to reduce bundle size as it relates to the default themes that are currently included in ExpressiveCodeEngine.

In short, when themes are provided in the config, the github-dark and github-light themes are still included in the bundle even though they will never be used. This unnecessarily increases bundle size by 22,472 bytes.

To eliminate the increase when themes are provided, there are three options:

  1. Eliminate these defaults from ExpressiveCodeEngine and move them to a higher level of abstraction (e.g., ExpressiveCode itself) and throw an error inside of ExpressiveCodeEngine if no themes have been specified (it would fail a little father down in the code path anyway so an error gives something meaningful vs. an unhandled exception). The only downside to this approach that I can see is that if anyone in userland is creating an instance of ExpressiveCodeEngine directly, this would be a breaking change. My guess is it's unlikely anyone is creating a direct instance of the engine, however, and even in that case, it would be a simple adjustment to their code to include their desired themes.
  2. Create a new ExpressiveCodeEngineCore and expose via a different entry point. In short, ExpressiveCodeEngine would be a subclass of ExpressiveCodeEngineCore and layer in the default themes before calling super. This approach would only be required to ensure 100% backwards compat but introduces an additional layer of complexity to the code base. I've created an example of what this would look like. Note that the name @expressive-code/core/core for the entry point likely needs a different name (e.g., possibly base).
  3. A variant of option 2, this would instead be to create a different entry point (e.g., full) that contains everything in @expressive-code/core that isn't actually Core - for now, this would just be ExpressiveCodeEngine. Given the name core is already in the package name and Core is implied, this make a little more sense than option 2 but it does mean ExpressiveCodeEngine would no longer be exported via @expressive-code/core and instead @expressive-code/core/full which would be a breaking change for anyone consuming ExpressiveCodeEngine directly.

At the end of the day, 22,472 bytes isn't the end of the world but it's still unnecessary in cases where userland wants full control of the bundle size by utilizing the Core layer. I think option 1 is the most straightforward with only one likely minor drawback. That said, option 2 & 3 provide some additional opportunities for potentially even further reducing bundle size for things that also exist in @expressive-core/core that may not be fully required for EC to function properly. I haven't gone down the entire path yet of what is/isn't needed for baseline functionality, but this could further refine @expressive-code/core to truly be a Core layer and only contain the bare minimum.

Lemme know your thoughts, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: support providing shiki highlighter as an option to enable fine-grained bundle support beyond astro
2 participants