In the first episode of this series we discussed stack unwinding in the easiest case: languages that are compiled ahead-of-time to native code, by a compiler that follows platform conventions. The most important of these, at least on Linux, is that it emit .eh_frame metadata, a format for instructions on how to unwind a frame from any instruction, invented for C++ exception handling but also essential for profilers and debuggers.
Had the industry stuck with these straightforward languages, our unwinder would have been a finished and fossilized project long ago. Alas, programmers are generally unwilling to handicap their own productivity to make the lives of profiler developers easier. Over the last thirty years or so, led by Java, Python, and JavaScript, languages with execution models other than ahead-of-time native compilation have gone from relatively niche tools with a minority of dedicated fans to being the dominant general-purpose application languages in use today. Therefore, any profiler has to either support them, or be severely constrained in its usefulness.
In this article, we focus on JavaScript as implemented by V8 (the engine used in both Chrome and Node.js), and explain how parca-agent unwinds V8 stacks and associates the frames it discovers with human-readable function names and line numbers.
V8 basics: interpretation and compilation
All user-supplied JavaScript code must be parsed by V8 into bytecode for a virtual-machine interpreter called Ignition before it can be executed. Ignition is a classic register-accumulator machine; its bytecode is very slow to execute relative to optimized machine code, but it has the advantage of being relatively simple and fast to produce.
If a function is called frequently, V8 may decide to compile its bytecode to machine code. In fact, V8 contains three separate compilers from Ignition bytecode to machine code, choosing between them based on a heuristic estimate of the best tradeoff between time required for compilation and time spent in execution. In order of complexity, these are:
- Sparkplug, also often referred to as baseline: the simplest compiler, running the fastest but producing the slowest code (which is nevertheless much faster to execute than virtual bytecode). Sparkplug straightforwardly maps each Ignition bytecode to a chunk of native machine code, doing little to no optimization.
- Maglev, an intermediate tier, substantially slower than Sparkplug, producing "pretty good" machine code.
- TurboFan, an industrial-grade modern optimizing compiler. Just like Clang or GCC, it is very slow, and therefore only worth running for very hot functions.
Code in real-world applications can be in any of these tiers; our profiler must therefore handle all of them properly.
Stack layout and unwinding
Recall that for native code, where we can't rely on the presence of frame pointers, we instead use .eh_frame data for unwinding. In V8, the situation is exactly backwards. Because code is generated on the fly, there's no fixed binary in which .eh_frame data could have been included, the way there is for an ahead-of-time compiled executable. Fortunately, however, unlike in native code under certain combinations of compiler flags, Node.js stacks always have frame pointers, a property which does not depend on the compilation tier. Thus, to unwind the stack when we're stopped and asked to take a sample, we can simply use the algorithm expressed by the following Python-like pseudocode. Note that the register names used here correspond to the aarch64 architecture, but the situation on x86-64 is basically the same.
fp = registerValue('x29')
pc = currentProgramCounter  # i.e., the address of the currently-executing instruction
while inCodeWithFramePointers(pc):
    # we've found a frame executing pc, let's alert the collector
    emitFrame(pc)
    # Frame pointers are enabled, so the next frame's fp is available at the address given by
    # _this_ frame's fp, and the next frame's pc follows it immediately.
    fp, pc = readTwoPointersAtAddress(fp)
# continue unwinding native frames...
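To make the pointer-chasing concrete, here is a minimal runnable sketch of the same loop in Python. It walks a simulated stack held in a bytes buffer rather than a real process; the helper names (walk_frames, read_two_pointers), the addresses, and the in_fp_code predicate are all invented for illustration and are not part of the actual unwinder.

```python
import struct


def read_two_pointers(memory, base, addr):
    """Read the saved frame pointer and the return address stored at addr.

    memory is a bytes snapshot of the stack, and base is the address of its
    first byte (both hypothetical stand-ins for reading process memory).
    """
    return struct.unpack_from("<QQ", memory, addr - base)


def walk_frames(memory, base, fp, pc, in_fp_code, max_frames=128):
    """Follow the frame-pointer chain while pc stays in frame-pointer code.

    Returns the list of pcs found, plus the fp and pc at which the walk
    stopped, so that another unwinding strategy could take over from there.
    """
    frames = []
    while in_fp_code(pc) and len(frames) < max_frames:
        frames.append(pc)
        fp, pc = read_two_pointers(memory, base, fp)
    return frames, fp, pc


# Three simulated frames: each slot pair is (caller's fp, return address).
STACK_BASE = 0x7000
STACK = struct.pack("<6Q", 0x7010, 0x2000, 0x7020, 0x3000, 0x0, 0x9000)


def in_js_code(pc):
    # Pretend that addresses in [0x1000, 0x4000) hold V8-generated code.
    return 0x1000 <= pc < 0x4000


frames, last_fp, last_pc = walk_frames(STACK, STACK_BASE, 0x7000, 0x1000, in_js_code)
print([hex(f) for f in frames], hex(last_pc))
```

The max_frames bound is a defensive choice for the sketch: a corrupted chain could otherwise loop forever.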
This approach -- just walking a chain of pointers -- is evidently much simpler than the complex logic required to read .eh_frame data. So are we done? Can we conclude that actually, profiling V8 code is even easier than profiling native code?
Unfortunately not, because the pseudocode above glosses over at least two difficult questions; to wit:
- The definition of the inCodeWithFramePointers function. How can we tell when we're in a V8 frame, rather than a native one, so that we know to use frame pointers? We can't simply assume that all frames in a Node.js process are V8 frames: JavaScript code can call into native libraries, and there are interpreter C++ frames running above any JavaScript code.
- A bare pc is not very useful: users want to know what lines of code were executing, not just where they happened to be loaded in the address space. How can we translate the pc emitted here to a human-readable function name and line number within user JavaScript code?
We will treat these two questions in turn.
V8 interpreter detection
How do we know when we're in a V8 frame (whether interpreted or JIT-compiled), so that we know to use FP-based unwinding (and the V8-related symbolization techniques discussed later)?
For user-supplied code, this is relatively simple: when we detect a V8 interpreter process (which we do in perhaps the simplest way imaginable: by a regex against the binary name), we simply assume that every anonymous executable mapping in the memory space of that process contains JavaScript code. This heuristic works well enough for our purposes: since it's in an executable memory page, it's probably code of some kind, and since it's anonymous (as opposed to file-backed memory), it's probably not the interpreter itself (e.g. the node executable), which is mostly written in C++.
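As a rough illustration of this heuristic, the sketch below filters the text of a /proc/<pid>/maps file for anonymous executable mappings. The regular expression and the function names are hypothetical (the article does not give the agent's actual pattern), and the sample mappings are made up.

```python
import re

# Hypothetical pattern: the agent's real regex is not shown in this article.
NODE_BINARY = re.compile(r"(?:^|/)node(?:js)?$")

SAMPLE_MAPS = """\
00400000-04a00000 r-xp 00000000 08:01 1234 /usr/bin/node
1f2a40000000-1f2a40040000 r-xp 00000000 00:00 0
7f10a0000000-7f10a0021000 rw-p 00000000 00:00 0
7ffd2b1f0000-7ffd2b1f2000 r-xp 00000000 00:00 0 [vdso]
"""


def is_v8_process(exe_path):
    """Guess whether a process hosts V8, based only on its binary name."""
    return NODE_BINARY.search(exe_path) is not None


def anonymous_executable_ranges(maps_text):
    """Return (start, end) for executable mappings that have no pathname."""
    ranges = []
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional path.
        fields = line.split(maxsplit=5)
        if len(fields) < 5:
            continue
        perms = fields[1]
        pathname = fields[5] if len(fields) > 5 else ""
        if "x" in perms and pathname == "":
            start, end = (int(part, 16) for part in fields[0].split("-"))
            ranges.append((start, end))
    return ranges


print(is_v8_process("/usr/bin/node"), anonymous_executable_ranges(SAMPLE_MAPS))
```

Note that named pseudo-mappings such as [vdso] are excluded simply because their pathname field is not empty.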
However, totally excluding the interpreter binary is actually too aggressive. The binary contains several "builtins": snippets of pre-assembled code built into the executable that perform various tasks. These range from very simple operations like adding two numbers, to more complex functionality (for example, Map.groupBy is implemented as a builtin). Despite being baked into the binary as native code, these builtins largely follow JavaScript code conventions: they maintain frame pointers, and they don't have associated .eh_frame instructions. So we need to treat them similarly to JavaScript code, not C++ internals. In fact, Ignition bytecode is executed entirely by builtins, so if we did not find them and treat them specially, we would not be able to unwind non-JITted code at all!
In order to find the range of code containing the builtins, we use a heuristic. The embedded code blob containing the builtins happens to be installed by the linker just after a particular function called DefaultSnapshotBlob(). Official Node.js builds tend not to strip symbols, so we can look up that function and find the first large area of code shortly after it that has no .eh_frame instructions. This seems to work quite well in practice.
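The gap search can be sketched as follows. This is not the agent's implementation: the function name, the representation of .eh_frame coverage as a list of address ranges, and the size threshold are all assumptions chosen to show the idea of "first large uncovered area after an anchor symbol".

```python
def find_builtins_range(anchor, covered, text_end, min_size):
    """Find the first gap of at least min_size bytes at or after anchor.

    anchor   -- address of the symbol that precedes the embedded blob
    covered  -- (start, end) ranges that have .eh_frame unwind entries
    text_end -- end address of the executable section being searched
    min_size -- hypothetical threshold for what counts as a "large" gap
    """
    cursor = anchor
    for start, end in sorted(covered):
        if end <= cursor:
            continue  # range lies entirely before the point we care about
        if start - cursor >= min_size:
            return (cursor, start)  # large uncovered area found
        cursor = max(cursor, end)
    if text_end - cursor >= min_size:
        return (cursor, text_end)
    return None


# Made-up layout: small functions around the anchor, then a big uncovered
# region, then ordinary covered code again.
COVERED = [(0x0800, 0x0900), (0x1000, 0x1020), (0x1040, 0x1060), (0x9000, 0x9100)]
print(find_builtins_range(0x1000, COVERED, 0xA000, 0x1000))
```

Small alignment gaps between ordinary functions are skipped because they fall below the threshold, which is why the threshold matters more than the exact anchor position.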
V8 frame symbolization
This is where most of the complexity lies, as different types of code store the information needed to symbolize them (that is, to recover a function name and source line numbers) in different places.
Typically (glossing over some details of minor relevance), frames are either "standard" or "typed". Standard frames correspond to JavaScript code, regardless of its compilation tier, whereas typed frames are created by the interpreter for various purposes (like crossing the JavaScript-to-Native boundary, or invoking a constructor via new). Typed frames are so called because they contain a "frame type marker" at a known offset from the frame pointer.
As already stated, standard frames correspond to JavaScript code; they contain a pointer to a JSFunction -- the V8 internal structure representing JavaScript functions -- again at a known offset from the frame pointer.
These JSFunctions contain pointers to other objects such as SharedFunctionInfo, Code, and ScopeInfo, which together contain various information necessary for execution and debugging, like a function's name, Ignition bytecode, currently-installed optimized native code (if any), and tables allowing for conversion from offsets within those arrays to source information.
Armed with this information, we can construct a more accurate algorithm for unwinding and symbolization. When a V8 frame is encountered in the unwinder, we first check the known location of the frame type in typed frames. If we find a number there, we know it is a frame type and report this information to the unwinder; it will use the reported frame type to construct a name for the frame. Otherwise, if it is a pointer, we know we are instead dealing with a JavaScript frame. Two fortunate facts make this analysis possible: first, that JavaScript frames hold a pointer to their "context" (whose meaning doesn't concern us here, only that it's a pointer) in the same slot where typed frames hold their type marker. Second, that pointers and numbers in V8 are both tagged, making it possible to reliably distinguish them.
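The tag check itself is tiny. The sketch below assumes V8's usual tagging scheme, in which small integers have their low bit clear and heap object pointers have it set; the function name is invented, and decoding the marker by shifting right by one bit is an assumption about how the frame type is stored, not something this article specifies.

```python
SMI_TAG_MASK = 0x1     # low bit clear: the word is a small integer
HEAP_OBJECT_TAG = 0x1  # low bit set: the word is a tagged heap pointer


def classify_context_slot(word):
    """Classify the word found in the context / frame-type-marker slot.

    Returns ("typed", frame_type) for a typed frame, or
    ("javascript", untagged_address) for a standard JavaScript frame.
    """
    if (word & SMI_TAG_MASK) == 0:
        # Assumption: the marker is the frame type shifted left by one bit.
        return ("typed", word >> 1)
    return ("javascript", word - HEAP_OBJECT_TAG)


print(classify_context_slot(0x1A))        # a number: typed frame
print(classify_context_slot(0x2F0041A5))  # a tagged pointer: JavaScript frame
```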
For JavaScript frames, we get the JSFunction object at the fixed offset mentioned above, and load its SharedFunctionInfo and (if available) Code objects. We use information in those objects, as well as further hints from the frame layout, to determine whether we are dealing with Ignition bytecode, compiled but unoptimized code (i.e. Sparkplug code), or compiled optimized code. We can also determine (because Code reports the beginning and end addresses of the loaded code blob) the offset within the function from the current program counter; we report all this information, as well as the address of the Code object itself, to the userspace agent, which can then walk its structures to find the function name and source line numbers.
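The last two steps, turning a program counter into an offset within the code blob and then into a source line, can be sketched like this. The position table here is a hypothetical, already-decoded list of (code offset, line) pairs; V8's real tables are stored in a more compact form that the userspace agent has to decode first.

```python
import bisect


def pc_to_offset(pc, code_start, code_end):
    """Return pc's offset within [code_start, code_end), or None if outside."""
    if not (code_start <= pc < code_end):
        return None
    return pc - code_start


def line_for_offset(position_table, offset):
    """Look up the source line covering a code offset.

    position_table is a list of (code_offset, source_line) pairs sorted by
    code_offset; each entry applies until the next entry begins.
    """
    offsets = [entry[0] for entry in position_table]
    index = bisect.bisect_right(offsets, offset) - 1
    if index < 0:
        return None
    return position_table[index][1]


# Made-up table for a function whose code is loaded at 0x5000..0x5100.
TABLE = [(0, 10), (24, 11), (60, 14)]
offset = pc_to_offset(0x5028, 0x5000, 0x5100)
print(offset, line_for_offset(TABLE, offset))
```

A binary search is used because the table is sorted by offset and lookups happen for every sampled frame.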
Discovering offsets
At various points throughout this article, we've relied on the fact that we're able to discover offsets known statically to the interpreter: we've both mentioned these explicitly (the offset from the frame pointer of the JSFunction for JavaScript frames, or the frame type for typed frames), and assumed their existence implicitly (the offset within JSFunction at which Code and SharedFunctionInfo are stored, as well as the offsets in those structs of various information like the offset-to-source-line tables). How are these determined?
A simple but incomplete solution is to use debug information embedded in V8 itself. The V8 developers realized that being able to walk V8-specific functions was crucial for debugging, especially when debugging core dumps after a crash. They therefore include, at build time, hundreds of entries in the symbol table of the binary which point to various important constants: the values of important internal enum variants, the offsets of various important fields in internal structs, and so on. These begin with v8dbg_, and are stored as dynamic symbols, so that they aren't removed even if the binary is stripped. By reading the appropriate v8dbg symbols, we can discover most of the constants we need, but the solution is incomplete: not every possible internal field offset was ever included as a v8dbg symbol, only the ones thought to be most crucial. Unfortunately, some of the ones we rely on are not in that set. Furthermore, even information that was included sometimes disappears in certain versions, due to bugs. Thus, additional techniques are necessary.
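In outline, consuming this metadata amounts to filtering the symbol table by prefix and falling back to something else when an entry is missing. The sketch below assumes the symbols have already been read into (name, value) pairs; the function names, the symbol names, and the values are illustrative only, not taken from any particular V8 build.

```python
V8DBG_PREFIX = "v8dbg_"

# Illustrative stand-in for a parsed symbol table with the constants
# already read out of the binary; names and values are made up.
SYMBOLS = [
    ("main", 0x401000),
    ("v8dbg_off_fp_function", -16),
    ("v8dbg_off_fp_context", -8),
    ("v8dbg_HeapObjectTag", 1),
]


def collect_v8dbg_constants(symbols):
    """Keep only v8dbg_ entries, keyed by the name without the prefix."""
    return {
        name[len(V8DBG_PREFIX):]: value
        for name, value in symbols
        if name.startswith(V8DBG_PREFIX)
    }


def constant_or_fallback(constants, key, fallback):
    """Use the embedded constant if present, else a caller-supplied fallback."""
    return constants.get(key, fallback)


constants = collect_v8dbg_constants(SYMBOLS)
print(constant_or_fallback(constants, "off_fp_function", None))
print(constant_or_fallback(constants, "some_missing_offset", 24))
```

The second lookup shows the incomplete case: the constant is absent, so the value has to come from another source.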
To fill in the gaps in the v8dbg data, we primarily use manual matching on the version number. Some representative real-life Go code should illustrate the flavor:
if vms.Fixed.FirstJSFunctionType == 0 {
	//nolint:lll
	// The V8 InstanceType tags are defined at:
	//   https://chromium.googlesource.com/v8/v8.git/+/refs/tags/9.2.230.1/src/objects/instance-type.h#124
	// which in turn is generated from the data at:
	//   https://chromium.googlesource.com/v8/v8.git/+/refs/tags/9.2.230.1/tools/v8heapconst.py#175
	// Since V8 9.0.14 the JSFunction is no longer a final class, but has several
	// classes inheriting from it. The only way to check for the inheritance is to
	// know which InstanceType tags belong to the range.
	numJSFuncTypes := uint16(1)
	switch {
	case d.version >= v8Ver(9, 6, 138):
		// Class constructor special case
		//nolint:lll
		// https://chromium.googlesource.com/v8/v8.git/+/1cd7a5822374a49ab6767185e69119d0d3076840
		numJSFuncTypes = 15
	case d.version >= v8Ver(9, 0, 14):
		// Several constructor special cases added
		//nolint:lll
		// https://chromium.googlesource.com/v8/v8.git/+/624030e975cb4384f877b65070b4e650a6acb1ef
		numJSFuncTypes = 14
	}
	vms.Fixed.FirstJSFunctionType = vms.Type.JSFunction
	vms.Fixed.LastJSFunctionType = vms.Fixed.FirstJSFunctionType + numJSFuncTypes - 1
}
That's a small sample; in reality, these defaults and fallbacks go on for about 200 lines of code. Some constants change frequently enough that maintaining these fallbacks would be too brittle and labor-intensive; for those cases, our agent analyzes the actual machine code of V8 functions. For example, if we need to know the offset of a particular field in a struct, we might find a function that loads that field from a pointer to the struct and returns it. Then, at runtime, for any version of V8, we can find this function by its name (at least in non-stripped binaries) and analyze the actual x86 or aarch64 instructions it contains to find a memory load at some offset from the function argument: the offset of that load is the offset of the field in the struct. These techniques are used sparingly for V8, but are required much more commonly for certain other unwinders, notably LuaJIT.
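To give a flavor of the machine-code analysis, here is a deliberately tiny x86-64 sketch. It recognizes only one instruction form, a 64-bit load into rax from an address based on rdi (the first argument register in the System V calling convention), and returns the displacement. The function name is invented; a real analyzer has to cope with many more encodings, and with aarch64 as well.

```python
def field_offset_from_getter(code):
    """Return disp if code begins with `mov rax, [rdi + disp]`, else None.

    Only three encodings are recognized:
      48 8B 07            mov rax, [rdi]
      48 8B 47 <disp8>    mov rax, [rdi + disp8]
      48 8B 87 <disp32>   mov rax, [rdi + disp32]
    """
    # 0x48 is the REX.W prefix, 0x8B is the opcode for a load into a register.
    if len(code) < 3 or code[0] != 0x48 or code[1] != 0x8B:
        return None
    modrm = code[2]
    if modrm == 0x07:  # mod=00: no displacement
        return 0
    if modrm == 0x47 and len(code) >= 4:  # mod=01: signed 8-bit displacement
        return int.from_bytes(code[3:4], "little", signed=True)
    if modrm == 0x87 and len(code) >= 7:  # mod=10: signed 32-bit displacement
        return int.from_bytes(code[3:7], "little", signed=True)
    return None


# mov rax, [rdi + 0x18]; ret
print(field_offset_from_getter(bytes([0x48, 0x8B, 0x47, 0x18, 0xC3])))
```

Returning None for anything unrecognized lets the caller fall back to version-based defaults instead of trusting a misread offset.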
Acknowledgments and Conclusion
We hope this article has given you a taste of the type of work that goes into maintaining a production-grade profiler for a major programming language interpreter and JIT compiler. Other runtimes work differently in the details, but many of the techniques and principles are the same.
As usual, an AI usage disclaimer: Claude Code on the Opus 4.8 model checked this document for accuracy, finding minor inaccuracies and typos. I have also used Claude Code extensively to help research V8 and Node.js internals, both for this article and for technical work on the profiler. No AI was involved in writing or editing this article in any way other than those mentioned in this paragraph.
I will close by thanking and acknowledging the OTel Profiling community: the V8 unwinder we're currently using comes directly from the opentelemetry-ebpf-profiler project, and although we now collaborate significantly on maintaining it, it is in large part not our original work.