hardcoded React UMD global #10

Open
e9x opened this issue May 11, 2023 · 15 comments
Labels
enhancement New feature or request jsx webpack

Comments

@e9x

e9x commented May 11, 2023

The current Sketchy implementation only decompiles React JSX when the code utilizes the UMD global, which is not effective since the majority of React websites incorporate the library within their bundle.

To make the decompilation process more effective and adaptable to different React websites, I recommend a more dynamic approach by identifying the React library being used in the compiled code, instead of hardcoding the use of 'React'. This can possibly be achieved by finding the variable name assigned to the React library and using that in the matchers.

constMemberExpression(m.identifier('React'), 'createElement'),

constMemberExpression(m.identifier('React'), 'createElement'),

constMemberExpression(m.identifier('React'), 'Fragment'),
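To make the suggestion concrete, here is a minimal sketch of a matcher that takes a set of candidate names instead of the literal 'React'. It is a hand-written predicate over ESTree-style nodes, not webcrack's actual matcher API; `isCreateElementCallee`, `member`, and `reactNames` are hypothetical names used only for illustration.

```javascript
// Returns true when `node` has the shape `<name>.createElement`, where <name>
// is any identifier in `reactNames` (rather than only the literal 'React').
function isCreateElementCallee(node, reactNames) {
  return (
    node.type === "MemberExpression" &&
    !node.computed &&
    node.object.type === "Identifier" &&
    reactNames.has(node.object.name) &&
    node.property.type === "Identifier" &&
    node.property.name === "createElement"
  );
}

// Hand-built ESTree-style node for `<objectName>.<propertyName>` (illustration only).
function member(objectName, propertyName) {
  return {
    type: "MemberExpression",
    computed: false,
    object: { type: "Identifier", name: objectName },
    property: { type: "Identifier", name: propertyName },
  };
}

// Assumption: `t` was identified earlier as the bundle's local name for React.
const reactNames = new Set(["React", "t"]);

const umdMatch = isCreateElementCallee(member("React", "createElement"), reactNames);
const bundledMatch = isCreateElementCallee(member("t", "createElement"), reactNames);
const unrelatedMatch = isCreateElementCallee(member("document", "createElement"), reactNames);
```

Keeping 'React' in the set preserves the UMD behaviour, and `document.createElement` is still rejected because `document` was never identified as React.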

@j4k0xb
Owner

j4k0xb commented May 12, 2023

instead of hardcoding the use of 'React'. This can possibly be achieved by finding the variable name assigned to the React library

That's a possibility, but I have to check the hardcoded name either way because of UMD.

There's another feature request that would solve this issue without much extra work (edit: nvm, it's harder than expected):

one file got renamed to utils.js and const a = require("./utils.js") could be renamed to const utils = require("./utils.js")

So first identify the react require/import variable and rename it to React, then the jsx transforms can automatically find it.

@j4k0xb j4k0xb added the enhancement New feature or request label May 12, 2023
@e9x
Author

e9x commented May 12, 2023

What if the renaming of assignments to require() calls was extended to functions, classes, variables, properties, arguments, and everything else?

See example

That way the UMD global matchers can be replaced with something like require_React.createElement. The old UMD global matcher can be kept too.

This goes beyond the scope of this issue, but it would be a massive improvement.

@e9x
Author

e9x commented May 13, 2023

I published my decompiler that I used in the above example. I think it might be a good reference for adding this feature.
https://github.com/e9x/krunker-decompiler

@0xdevalias
Contributor

I'm not sure of the exact way they go about resolving this, but I came across another tool today that seemed to handle embedded React pretty well:

Digging through the code a little for stuff related to React/JSX led me to this:

@0xdevalias
Contributor

Came across this issue again while testing the new v2.11.0 web IDE update today.

Another tool originally struggled with this too (Ref)

You can get the minimised code that I am testing against here (Ref)

Loading that in the webcrack web IDE (Ref) with the following config:

[Screenshot: webcrack web IDE config]

You can see the issues related to this in files like 180.js, where the JSX hasn't been unminimised:

180.js

require.d(exports, {
  Z: function () {
    return a;
  }
});
var r = require( /*webcrack:missing*/"./35250.js");
function a(e) {
  var t;
  var n = e.url;
  var a = e.size;
  var i = a === undefined ? 16 : a;
  var s = e.className;
  try {
    t = new URL(n);
  } catch (e) {
    console.error(e);
    return null;
  }
  return (0, r.jsx)("img", {
    src: `https://icons.duckduckgo.com/ip3/${t.hostname}.ico`,
    alt: "Favicon",
    width: i,
    height: i,
    className: s
  });
}

@0xdevalias
Contributor

0xdevalias commented Dec 20, 2023

Here's a wip version that converts all top level requires and this export variation: https://deploy-preview-31--webcrack.netlify.app/

Originally posted by @j4k0xb in #30 (comment)

That WIP (see #31) seems to have fixed the above r.jsx, which now gets converted back into JSX properly (as you showed in #30 (comment))

Looking at that same original source file (Ref), in 63390.js, there are these jsxs imports/usages, that aren't being unminified back to JSX (though amusingly, the children part is):

// 63390.js, line 5
import { jsxs, Fragment, jsx } from /*webcrack:missing*/"./35250.js";
// 63390.js, lines 151-157
return jsxs(r ? "a" : "div", {
    className: _Z("flex h-full w-full flex-col overflow-hidden rounded-md border border-black/10 bg-gray-50 shadow-[0_2px_24px_rgba(0,0,0,0.05)]", s),
    href: r,
    target: r ? "_blank" : "",
    onClick: h,
    children: [c && <H><div className="absolute inset-0"><img src={a} alt={`image of ${n}`} className="h-full w-full border-b border-black/10 object-cover" /></div></H>, <div className="flex flex-1 flex-col justify-between gap-1.5 p-3"><_Component65 $clamp={u !== undefined && u || c}>{n}</_Component65><div className="flex items-center gap-1">{i ? <_Z5 url={i} name={t} size={13} /> : <_Z6 url={r} size={13} />}<div className="text-[10px] leading-3 text-gray-500 line-clamp-1">{t}</div></div></div>]
  });

Contrasting this against wakaru's output, which also used to have issues with jsxs (Ref), with some notes about why it wasn't working in that case here (Ref), and which still seems to struggle with the jsxs(url ? "a" : "div", { part even now (Ref):


Source (unpacked)

// module-63390.js, lines 186-225
  return (0, o.jsxs)(r ? "a" : "div", {
    className: (0, l.Z)(
      "flex h-full w-full flex-col overflow-hidden rounded-md border border-black/10 bg-gray-50 shadow-[0_2px_24px_rgba(0,0,0,0.05)]",
      s
    ),
    href: r,
    target: r ? "_blank" : "",
    onClick: h,
    children: [
      c &&
        (0, o.jsx)(H, {
          children: (0, o.jsx)("div", {
            className: "absolute inset-0",
            children: (0, o.jsx)("img", {
              src: a,
              alt: "image of ".concat(n),
              className: "h-full w-full border-b border-black/10 object-cover",
            }),
          }),
        }),
      (0, o.jsxs)("div", {
        className: "flex flex-1 flex-col justify-between gap-1.5 p-3",
        children: [
          (0, o.jsx)(z, { $clamp: (void 0 !== u && u) || c, children: n }),
          (0, o.jsxs)("div", {
            className: "flex items-center gap-1",
            children: [
              i
                ? (0, o.jsx)(R.Z, { url: i, name: t, size: 13 })
                : (0, o.jsx)(U.Z, { url: r, size: 13 }),
              (0, o.jsx)("div", {
                className: "text-[10px] leading-3 text-gray-500 line-clamp-1",
                children: t,
              }),
            ],
          }),
        ],
      }),
    ],
  });

Transformed (unminified)

// module-63390.js, lines 213-252
return jsxs(url ? "a" : "div", {
    className: Z$0(
      "flex h-full w-full flex-col overflow-hidden rounded-md border border-black/10 bg-gray-50 shadow-[0_2px_24px_rgba(0,0,0,0.05)]",
      className
    ),
    href: url,
    target: url ? "_blank" : "",
    onClick: h,
    children: [
      c && (
        <H>
          {
            <div className="absolute inset-0">
              {
                <img
                  src={imageUrl}
                  alt={`image of ${title}`}
                  className="h-full w-full border-b border-black/10 object-cover"
                />
              }
            </div>
          }
        </H>
      ),
      <div className="flex flex-1 flex-col justify-between gap-1.5 p-3">
        <Z$2 $clamp={(mini !== undefined && mini) || c}>{title}</Z$2>
        <div className="flex items-center gap-1">
          {logoUrl ? (
            <R.Z url={logoUrl} name={t} size={13} />
          ) : (
            <U.Z url={url} size={13} />
          )}
          <div className="text-[10px] leading-3 text-gray-500 line-clamp-1">
            {t}
          </div>
        </div>
      </div>,
    ],
  });
}

@j4k0xb
Owner

j4k0xb commented Dec 20, 2023

The code probably looked like this and Tag got inlined later:

const Tag = r ? "a" : "div";
return <Tag />;

Should be enough to extract it to a variable again
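A rough sketch of why extracting the tag works: a JSX tag has to be a name rather than an arbitrary expression, so the conditional needs a capitalized binding before it can be written as an element. The `jsxs` function below is a hypothetical stand-in that just records its arguments, not React's runtime, and the JSX form only appears in a comment so the snippet runs as plain JavaScript.

```javascript
// Hypothetical stand-in for the automatic-runtime `jsxs` helper.
function jsxs(type, props) {
  return { type, props };
}

function inlined(r) {
  // What the bundle contains: the tag is an arbitrary expression.
  return jsxs(r ? "a" : "div", { href: r });
}

function extracted(r) {
  // What a decompiler can emit instead: a capitalized binding that JSX can
  // reference, i.e. `return <Tag href={r} />;`
  const Tag = r ? "a" : "div";
  return jsxs(Tag, { href: r });
}

const withLink = [inlined("https://example.com"), extracted("https://example.com")];
const withoutLink = [inlined(undefined), extracted(undefined)];
```

Both forms evaluate the conditional before the props, so the extraction does not reorder anything in this simple case.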

@0xdevalias
Contributor

The code probably looked like this and Tag got inlined later

@j4k0xb Yeah, that was my (or really.. ChatGPT's) conclusion as well :) (Ref)

@j4k0xb
Owner

j4k0xb commented Dec 22, 2023

That jsx type should work now: #38 / https://deploy-preview-38--webcrack.netlify.app/ (the branch doesn't have the esm changes yet)
I think it doesn't apply to nested elements or React.createElement because minifiers avoid changing the evaluation order.

@0xdevalias
Contributor

That jsx type should work now: #38 / deploy-preview-38--webcrack.netlify.app (the branch doesn't have the esm changes yet)

@j4k0xb I'm guessing that because that branch doesn't have the ESM changes, that's why it isn't getting unminimised at all, even when I use that branch you just mentioned?

@j4k0xb
Owner

j4k0xb commented Dec 24, 2023

Yes.. It detects calls like jsxs(r ? "a" : "div")
The ESM changes involve converting the require to import { jsx } from '...' and then converting (0, jsxs)() to jsxs() which is safe (#6)
I have now rebased the branch so it applies both: https://deploy-preview-31--webcrack.netlify.app/
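As an illustration of why that last step is likely safe (my reading, not something stated in #6): the `(0, x.f)()` form exists to call `f` without `x` as the receiver. Once `f` is a plain binding there is no receiver to drop, so the wrapper changes nothing. The `mod` object and its method are hypothetical.

```javascript
const mod = {
  // Reports whether the method was called with `mod` as its receiver.
  hasReceiver() {
    return this === mod;
  },
};

// Member call: `this` is `mod`.
const memberCall = mod.hasReceiver();

// Indirect call: the comma expression yields the bare function, so the receiver is dropped.
const indirectMemberCall = (0, mod.hasReceiver)();

// With a plain binding (as after `import { jsxs } from '...'`), both forms behave the same.
const { hasReceiver } = mod;
const plainCall = hasReceiver();
const indirectPlainCall = (0, hasReceiver)();
```

So `(0, r.jsxs)()` cannot be rewritten to `r.jsxs()` without changing `this`, whereas `(0, jsxs)()` and `jsxs()` are interchangeable.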

@0xdevalias
Contributor

0xdevalias commented Dec 24, 2023

I have now rebased the branch so it applies both: deploy-preview-31--webcrack.netlify.app

Looking at that same original source file (Ref), unminified, in 63390.js:

// 63390.js, lines 151-152
  const _Component68 = r ? "a" : "div";
  return <_Component68 className={_Z("flex h-full w-full flex-col overflow-hidden rounded-md border border-black/10 bg-gray-50 shadow-[0_2px_24px_rgba(0,0,0,0.05)]", s)} href={r} target={r ? "_blank" : ""} onClick={h}>{c && <H><div className="absolute inset-0"><img src={a} alt={`image of ${n}`} className="h-full w-full border-b border-black/10 object-cover" /></div></H>}<div className="flex flex-1 flex-col justify-between gap-1.5 p-3"><_Component67 $clamp={u !== undefined && u || c}>{n}</_Component67><div className="flex items-center gap-1">{i ? <_Z5 url={i} name={t} size={13} /> : <_Z6 url={r} size={13} />}<div className="text-[10px] leading-3 text-gray-500 line-clamp-1">{t}</div></div></div></_Component68>;

@j4k0xb Looks good, thanks! 🎉

@0xdevalias
Contributor

0xdevalias commented Mar 2, 2025

Some additional React issues I noticed today:

The files above all seem to get unpacked now 🎉, though seems there are still other issues with them (eg. React / JSX not being decompiled, etc), eg.

After unpacking, in 484.js, one example being around line 621.

Interestingly, while wakaru can't unpack the original file as-is, if I take 484.js and run it through the online IDE, it seems to do slightly better at de-compiling the React / JSX / etc (though it still has many issues of its own):

Potentially related to this:

Originally posted by @0xdevalias in #144 (comment)

And some notes exploring it a little deeper:

Looking at the code, on line 2 we have:

var t = window.React;

Which is then later used a bunch of times, usually like this:

/* ..snip.. */
/* 510      */       return t.createElement(z, null, l);
/* ..snip.. */
/* 516      */   } else {
/* 517      */     return t.createElement(F, null, t.createElement(L, {
/* 518      */       ...n
/* 519      */     }), i !== "loading" && t.createElement(G, null, i === "error" ? t.createElement(N, {
/* 520      */       ...n
/* 521      */     }) : t.createElement(j, {
/* 522      */       ...n
/* 523      */     })));
/* 524      */   }
/* 525      */ };
/* ..snip.. */
/* 549      */ var U = t.memo(({
/* 551      */   toast: e,
/* 552      */   position: l,
/* 553      */   style: i,
/* 554      */   children: n
/* 555      */ }) => {
/* ..snip.. */
/* 580      */ }) : t.createElement(t.Fragment, null, o, c));
/* ..snip.. */
/* 638      */      let [l, i] = (0, t.useState)(C);
/* 639      */      (0, t.useEffect)(() => {
/* ..snip.. */
/* 689      */    let n = (0, t.useCallback)(() => {
/* ..snip.. */
/* 718      */   return t.createElement("div", {
/* ..snip.. */

Originally posted by @0xdevalias in pionxzh/wakaru#36 (comment)

And a few more:

Also, looking a bit deeper, on line 1 we have:

const e = window.wp.element;

Which relates to:

So at least some of the JSX that isn't getting properly de-compiled seems to be related to that:


Originally posted by @0xdevalias in #36

@j4k0xb
Owner

j4k0xb commented Mar 2, 2025

Small update:

expectJS('(0, r.jsx)("div", {})').toMatchInlineSnapshot('<div />;');

180.js:

return <img src={`https://icons.duckduckgo.com/ip3/${t.hostname}.ico`} alt="Favicon" width={i} height={i} className={s} />;

Detects these calls no matter where r comes from now.
Still leaving this issue open until something similar can be figured out for createElement that won't break other code as easily.

@0xdevalias
Contributor

0xdevalias commented Mar 3, 2025

Still leaving this issue open until something similar can be figured out for createElement that won't break other code as easily.

While it's not a purely generic solution.. From the original issue description, is there a reason we can't detect this sort of thing:

I recommend a more dynamic approach by identifying the React library being used in the compiled code, instead of hardcoding the use of 'React'. This can possibly be achieved by finding the variable name assigned to the React library and using that in the matchers.

With common well-known React globals on window?

var e = window.wp.element;
var t = window.React;

And then have that propagate the 'this is react' aspect through to whatever downstream matchers rely on that so it can be decompiled properly? As that would seemingly solve #10 (comment), and I would imagine it shouldn't have many flow-on effects that would "break other code as easily"? (Unless I'm missing something?)
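A minimal sketch of what that detection could look like, assuming variable declarators are available as ESTree-style nodes. The `KNOWN_REACT_GLOBALS` list, `dottedPath`, and `collectReactAliases` are hypothetical names rather than existing webcrack code, and the sketch ignores scoping and reassignment, which a real transform would need to check.

```javascript
// Assumption: these are the "well-known" global paths worth recognising.
const KNOWN_REACT_GLOBALS = new Set(["React", "window.React", "window.wp.element"]);

// Flattens a non-computed member chain into "a.b.c"; returns null for anything else.
function dottedPath(node) {
  if (node.type === "Identifier") return node.name;
  if (node.type === "MemberExpression" && !node.computed && node.property.type === "Identifier") {
    const objectPath = dottedPath(node.object);
    return objectPath === null ? null : objectPath + "." + node.property.name;
  }
  return null;
}

// Collects every variable name initialised from one of the known global paths.
function collectReactAliases(declarators) {
  const aliases = new Set(["React"]);
  for (const { id, init } of declarators) {
    if (id.type !== "Identifier" || !init) continue;
    const path = dottedPath(init);
    if (path !== null && KNOWN_REACT_GLOBALS.has(path)) aliases.add(id.name);
  }
  return aliases;
}

// Hand-built nodes for: var e = window.wp.element; var t = window.React; var n = window.location;
const ident = (name) => ({ type: "Identifier", name });
const memberOf = (object, name) => ({ type: "MemberExpression", computed: false, object, property: ident(name) });

const aliases = collectReactAliases([
  { id: ident("e"), init: memberOf(memberOf(ident("window"), "wp"), "element") },
  { id: ident("t"), init: memberOf(ident("window"), "React") },
  { id: ident("n"), init: memberOf(ident("window"), "location") },
]);
```

The resulting set could then be handed to the createElement / Fragment matchers in place of the single hardcoded name.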


Edit: Looking deeper at the code.. this is where the jsx / jsxNew plugins are run within the main webcrack function; with jsx being run before jsxNew:

import jsx from './transforms/jsx';
import jsxNew from './transforms/jsx-new';

export async function webcrack(
  code: string,
  options: Options = {},
): Promise<WebcrackResult> {
  /* ..snip.. */
    (options.deobfuscate || options.jsx) &&
      (() => {
        applyTransforms(
          ast,
          [
            // Have to run this after unminify to properly detect it
            options.deobfuscate ? [selfDefending, debugProtection] : [],
            options.jsx ? [jsx, jsxNew] : [],
          ].flat(),
        );
      }),
  /* ..snip.. */
}

Based on that, I think updates would need to be made to both jsx and jsxNew.


Edit 2: Edit of the edit note: This was originally a deep dive exploration into window.wp.element and window.React; but then I figured out that they are more related to specific functionality in this instance, and not so relevant to the JSX stuff in this issue itself, so I created a new issue for that (but have kept the original in the detail block below for posterity):

Original DeepDive related to window.wp.element / window.React

Looking a bit deeper, I think window.wp.element relates more specifically to how the WordPress Gutenberg editor may inject things:

Specifically in the 'plain JS' usage:

So using window.wp.element would map to a version of @wordpress/element, provided by the backend through the window.wp global:

Whereas in the non-static version, we can see that registerBlockType directly refers to the imported Edit / Save, which seem to handle their own imports, and/or use a JSX transform defined elsewhere in the build chain:

We can also see that the window.React global might come from WordPress Gutenberg as well, as we can see from this example code that injects it:

We also get another clue here, where again window.React is injected into the function, and then a followup note to that:

So I guess, similar to the comment made in #143 (comment), the deeper specifics of this may belong in a separate plugin instead of webcrack core.

Though.. I do wonder if the window.React (assigned to a variable) usage is generic enough that it might make sense to include in core?


Edit 3: Edit of the edit note: I also created a standalone issue for this in case it ends up not being solvable within core, and needs a more library specific plugin based solution instead:

Looking at the code from #10 (comment) again, I think there is another case where JSX-like things may not be currently getting decompiled properly, which is syntax like this:

/* ..snip.. */
/* 541      */  var Z = h("div")`
/* 542      */    display: flex;
/* 543      */    justify-content: center;
/* 544      */    margin: 4px 10px;
/* 545      */    color: inherit;
/* 546      */    flex: 1 1 auto;
/* 547      */    white-space: pre-line;
/* 548      */  `;
/* ..snip.. */
/* 567      */  let c = t.createElement(Z, {
/* 568      */    ...e.ariaProps
/* 569      */  }, g(e.message, e));
/* ..snip.. */

Looking higher up in the file, we see the definition for h:

/* ..snip.. */
/* 106      */  function h(e, t) {
/* 107      */    let l = this || {};
/* 108      */    return function () {
/* 109      */      let i = arguments;
/* 110      */      function n(a, o) {
/* 111      */        let c = Object.assign({}, a);
/* 112      */        let s = c.className || n.className;
/* 113      */        l.p = Object.assign({
/* 114      */          theme: p && p()
/* 115      */        }, c);
/* 116      */        l.o = / *go\d+/.test(s);
/* 117      */        c.className = m.apply(l, i) + (s ? " " + s : "");
/* 118      */        if (t) {
/* 119      */          c.ref = o;
/* 120      */        }
/* 121      */        let r = e;
/* 122      */        if (e[0]) {
/* 123      */          r = c.as || e;
/* 124      */          delete c.as;
/* 125      */        }
/* 126      */        if (w && r[0]) {
/* 127      */          w(c);
/* 128      */        }
/* 129      */        return y(r, c);
/* 131      */      }
/* 132      */      if (t) {
/* 133      */        return t(n);
/* 134      */      } else {
/* 135      */        return n;
/* 136      */      }
/* 137      */    };
/* 138      */  }
/* ..snip.. */

And searching GitHub code for / *go\d+/.test leads us to the goober library:

Which we can then also see additional confirmation for in earlier code as well:

/* ..snip.. */
/* 6        */  let i = e => typeof window == "object" ? ((e ? e.querySelector("#_goober") : window._goober) || Object.assign((e || document.head).appendChild(document.createElement("style")), {
/* 7        */    innerHTML: " ",
/* 8        */    id: "_goober"
/* 9        */  })).firstChild : e || l;
/* ..snip.. */

Which seems to be used across a number of libs/projects:

Sometimes inlined directly:

This may end up being another case where, similar to the comment made in #143 (comment), the deeper specifics of this may belong in a separate plugin instead of webcrack core; but it makes me wonder if there is some kind of generic way we can identify a pattern of these sorts of React component generator libraries so that the JSX decompilation can work effectively with them?

Similar'ish prior art from wakaru:

Looking back at the main format of the styled function (which was h in the minified code above, with Z being the component it created):

This returns an inner wrapper function, which seems to use tagged template literal syntax to provide the CSS, and then it reads that from the arguments into _args:

It then uses the _args to create the CSS class name:

And then processes the tag (eg. "div") passed to the original function:

Eventually 'rendering' that through the 'pragma' h:

Which was assigned during setup earlier:

Tracing through the code in our bundle to find that 'pragma' function binding, we find t.createElement ends up being assigned to h (or y as it's called in our minified code):

/* ..snip.. */
/* 582      */  (function (e, t, l, i) {
/* 583      */    c.p = undefined;
/* 584      */    y = e;
/* 585      */    p = undefined;
/* 586      */    w = undefined;
/* 587      */  })(t.createElement);
/* ..snip.. */

And of course, we know that t relates to our React global:

/* ..snip.. */
/* 2        */  var t = window.React;
/* ..snip.. */

This obviously ends up going through a few extra steps of more library specific indirection that probably doesn't make sense to be in webcrack core.. but I wonder if we're able to trace/follow the React global / createElement 'pragma' / h through so that JSX decompilation can work correctly?

In the case of this library it also inserts the additional wrapping component Styled in the middle.. but I think if the createElement 'pragma' flowed through properly.. that might end up being properly figured out as nested JSX anyway; as the Styled just ends up wrapping our provided tag component:
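To illustrate the flow described above, here is a heavily simplified sketch of the pattern. It is not the library's actual source: `setup`, `styled`, `toClassName`, and the stand-in `createElement` are written only to show how the element factory gets stored once and later called by the wrapper component.

```javascript
// Hypothetical stand-in for React.createElement: returns a plain description object.
function createElement(type, props, ...children) {
  return { type, props: { ...props, children } };
}

let pragma; // plays the role of `y` in the minified bundle

// Mirrors the shape of the IIFE on lines 582-587: store the element factory once.
function setup(elementFactory) {
  pragma = elementFactory;
}

// Tiny illustrative hash, only so that the same CSS text yields the same class name.
function toClassName(css) {
  let hash = 0;
  for (const ch of css) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return "go" + hash;
}

// `styled(tag)` returns a tagged-template function, which returns the wrapper component.
function styled(tag) {
  return function (strings, ...values) {
    const css = strings.reduce((out, s, i) => out + s + (i < values.length ? values[i] : ""), "");
    const className = toClassName(css);
    return function Styled(props) {
      const extra = props && props.className;
      return pragma(tag, { ...props, className: extra ? className + " " + extra : className });
    };
  };
}

setup(createElement);
const Message = styled("div")`display: flex;`;
const element = Message({ role: "status" });
```

In this sketch `Message` plays the role of `Z`, and the only place an element is actually created is the `pragma(tag, ...)` call, which is why following the React global through to that binding would let the existing createElement handling apply.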
