35 changes: 23 additions & 12 deletions 3157_generative_extensions/generative-extensions.md
@@ -17,19 +17,30 @@ toc: true

# Introduction

Since the recent implementation of the reflection facilities proposed in [@P2996R3], we explored how the proposal would help fundamental challenges we are facing today. We believe reflection has great potential to solve important problems that we and the C++ community at large both face, and that a few specific enhancements on top of P2996 would help realize that potential. Based on our initial experience, this document (and its companions [@P3294R0] and [@P3289R0]) builds function synthesis capability on the foundation of P2996 that we think will help C++ reflection have a stronger and more timely positive impact on the state of affairs in the C++ language. We anticipate that C++ reflection, if sufficiently powerful, could dramatically reduce costs of developing and maintaining C++ code while at the same time improving code size, readability, compilation time, and execution speed. We expect to have a proof-of-concept implementation of our design available shortly.
Since the recent implementation of the reflection facilities proposed in [@P2996R3], we have explored how the proposal would help address fundamental challenges we are facing today. We believe reflection has great potential to solve important problems that the C++ community at large faces, and that a few specific enhancements on top of P2996 would help realize that potential. We anticipate that C++ reflection, if sufficiently powerful, could dramatically reduce the costs of developing and maintaining C++ code while at the same time reducing code size and improving readability, compilation time, and execution speed. Based on our initial experience, this document and its companions [@P3294R0] (Code Injection with Token Sequences) and [@P3289R0] (consteval blocks) build function synthesis capabilities on the foundation of P2996 that we think will help C++ reflection have an even stronger and more timely positive impact. We expect to have a proof-of-concept implementation of our design available shortly.

C++ code from a variety of domains naturally leads to boilerplate. Proxy classes are commonplace, whether to interface different codebases or as an organic part of design. The guideline ["prefer composition over inheritance"](https://en.wikipedia.org/wiki/Composition_over_inheritance) leads to many forward-to-member functions. Other useful [design patterns](https://en.wikipedia.org/wiki/Design_Patterns) such as Visitor, Observer, Decorator, Adapter, and [Null Object](https://www.slideshare.net/tcab22/null-object-design-pattern-presentation) require de facto maintenance of parallel class hierarchies and/or parallel function declarations (plus in many cases mechanical definitions). Foreign language/API interfaces typically consist of large swaths of code following repetitive patterns. Implementing high-performance parallel algorithms typically requires specialized patterns for a variety of size, stride, and type combinations. All of these instances feature enough irregularities and variations to make existing template metaprogramming techniques difficult to deploy, and if deployed, difficult to understand and maintain.
C++ code from a variety of domains naturally leads to boilerplate:
* Proxy classes, for example, are commonplace, whether to interface different codebases or as an organic part of design.
* The guideline ["prefer composition over inheritance"](https://en.wikipedia.org/wiki/Composition_over_inheritance) leads to many forward-to-member functions.
* Other useful [design patterns](https://en.wikipedia.org/wiki/Design_Patterns) such as Visitor, Observer, Decorator, Adapter, and [Null Object](https://www.slideshare.net/tcab22/null-object-design-pattern-presentation) require the replication of another type's interface and thus de facto maintenance of parallel class hierarchies and/or parallel function declarations (plus in many cases mechanical definitions).
* Foreign language/API interfaces typically consist of large swaths of code following repetitive patterns.
* Implementing high-performance parallel algorithms typically requires specialized patterns for a variety of size, stride, and type combinations.
* ...

The rest of this paper is structured as follows. Section "Function Descriptor Metafunctions" proposes a design for manipulating the reflection of functions, with an emphasis on generation capabilities. These capabilities—in conjunction with P2996, P3294, P3289, and [P3096](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3096r0.pdf)—provide a mechanism for querying and synthesizing function definitions in a powerful and flexible manner that makes reflection-based code simple and intuitive. Section "Proxy Classes and Instrumented Classes" defines essential use cases of reflection metaprogramming that motivate and inform the design and implementation of function synthesis. We consider strong use cases essential to setting goals for reflective metaprogramming. To be compelling, use cases must demonstrate meaningful, desirable functionality (albeit in a simplistic, proof-of-concept style) that is impossible or prohibitively difficult within the current C++. Section "Embedded Domain Specific Languages" is even more forward-looking, setting up the long-range trajectory of our proposal.
All of these instances feature enough irregularities and variations to make existing template metaprogramming techniques cumbersome to deploy, and if deployed, difficult to understand and maintain.

The rest of this paper is structured as follows:
* Section "Function Descriptor Metafunctions" proposes a design for manipulating the reflection of functions, with an emphasis on generation capabilities. These capabilities—in conjunction with P2996, P3294, P3289, and [P3096](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3096r0.pdf) (Function Parameter Reflection)—provide a mechanism for querying and synthesizing function definitions in a powerful and flexible manner that makes reflection-based code simple and intuitive.
* Section "Proxy Classes and Instrumented Classes" defines essential use cases of reflection metaprogramming that motivate and inform the design and implementation of function synthesis. We consider strong use cases essential to setting goals for reflection-based metaprogramming. To be compelling, use cases must demonstrate meaningful, desirable functionality (albeit in a simplistic, proof-of-concept style) that is impossible or prohibitively difficult within the current state of the C++ language.
* Section "Embedded Domain Specific Languages" is even more forward-looking, setting up a long-range trajectory of our proposal.

# Function Descriptor Metafunctions

This proposal is a companion to [P3294](https://brevzin.github.io/cpp_proposals/3294_code_injection/p3294r0.html) and meant to complement and work in conjunction with it. P3294 uses `decl_of(info)` (where `info` is the reflection of a function) as a key mechanism to expand reflections of existing functions into declarations, to which user code can subsequently attach definitions. This proposal focuses on creating and manipulating reflections of functions, to be later used with `decl_of(info)` as per P3294.
This proposal accompanies the code injection facilities proposed in [P3294](https://brevzin.github.io/cpp_proposals/3294_code_injection/p3294r0.html) and is meant to complement and work in conjunction with it. P3294 uses `decl_of(info)` (where `info` is the reflection of a function) as a key mechanism to expand reflections of existing functions into declarations, to which user code can subsequently attach definitions. This proposal focuses on creating and manipulating reflections of functions, to be later used with `decl_of(info)` as per P3294.

Given an existing function, we propose a number of *function descriptor metafunctions* that allow querying all aspects of it. Similar functions can be used to synthesize new functions.
We propose a number of *function descriptor metafunctions* that allow querying and mutating all aspects of an existing function. The same metafunctions can be used to synthesize new functions that resemble the original.

Given a function declaration `f`, its reflection `^f` is a mutable object that can be subsequently modified. That does not affect the initial declaration in any way, but can be used to generate new declarations and definitions. Using `^f` once again returns a new copy of its reflection, so there is no loss of information.
Given a function declaration `f`, its reflection `^f` is a mutable object of type `std::meta::info` that can be subsequently modified. Modifications do not affect the initial declaration in any way, but can be used to generate new declarations and definitions. These can be injected using the facilities described in P3294, or similar injection mechanisms. Evaluating `^f` once again returns a new copy of its reflection, so there is no loss of information.

The definitions of the proposed metafunctions are shown below. A subsection will be dedicated to describing each.

@@ -209,26 +220,26 @@ Consider the task of defining interface classes for API integration, including f

Such use cases generalize to creating *instrumented classes:* developing direct substitutes for existing classes while embedding specific hooks (such as tracing, logging, counters, argument verification, result validation, and naming convention changes) into some or all member functions. Successfully reflecting on and reconstructing a class in this manner serves as a critical test of a language's reflective metaprogramming capabilities, similar to how the identity function evaluates a language's functional programming features.

As a simple example, consider the Null Object design pattern, a safe alternative to the dreaded null pointer. Given an interface `T` that defines several pure virtual member functions, `null_object<T>` would yield an implementation of `T` that defines all of its virtuals to either throw an exception or return a default-constructed value of their result type. Defining a null object for a given interface is tediously simple, yet maintaining it is an exercise in frustration and a source of aggravation. Reflection should allow defining `null_object` for any type and behavior with ease.
As a simple example, consider the Null Object design pattern, a safe alternative to the dreaded null pointer. Given a type `T` defining an interface via several pure virtual member functions, `null_object<T>` would yield an implementation of `T` that defines all of its virtual functions to either throw an exception or return a default-constructed value of their result type. Defining a null object for a given interface is tediously simple, yet maintaining it is an exercise in frustration and a source of aggravation. Reflection should allow defining `null_object` for any type and behavior with ease.
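To make the maintenance burden concrete, here is a minimal hand-written sketch of the pattern in today's C++. The `logger` interface and the `null_logger` class are hypothetical names invented for this illustration; they are not part of the proposal. The sketch picks the "return a default-constructed value" behavior. Every pure virtual added to the interface later requires a matching override here, which is the mechanical work a reflection-based `null_object<T>` would presumably take over.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface, invented for illustration only.
struct logger {
    virtual ~logger() = default;
    virtual void write(const std::string& message) = 0;
    virtual int pending() const = 0;
    virtual std::string name() const = 0;
};

// Hand-written null object: each pure virtual is overridden to do nothing
// or to return a default-constructed value of its result type.
struct null_logger final : logger {
    void write(const std::string&) override {}
    int pending() const override { return {}; }
    std::string name() const override { return {}; }
};

// Client code works against a valid object instead of testing for nullptr.
inline int pending_messages(const logger& sink) { return sink.pending(); }

// Small usage example.
inline void null_logger_usage() {
    null_logger sink;
    sink.write("this message is discarded");
    assert(pending_messages(sink) == 0);
    assert(sink.name().empty());
}
```

The overrides carry no information beyond the interface's own signatures, so they are exactly the kind of code that can drift out of sync when maintained by hand.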

A more involved example would be defining a class such as `instrumented_vector<T, A>` that wraps an `std::vector<T, A>` and adds instrumentation (e.g., bounds checking) to some or all of its methods. The key here is to make `instrumented_vector` a drop-in replacement for `std::vector` without incurring the cost of copying all of its member function declarations. Needless to say, defining such a proxy class by hand is quite discouraging, and reflective metaprogramming should offer a complete and flexible solution.
A more involved example would be defining a class such as `instrumented_vector<T, A>` that wraps an `std::vector<T, A>` and adds instrumentation (e.g., bounds checking) to some or all of its methods. The key here is to make `instrumented_vector` a drop-in replacement for `std::vector` without incurring the programming effort of copying all of its member function declarations. Needless to say, defining such a proxy class by hand is quite discouraging, and reflective metaprogramming should offer a complete and flexible solution.
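For comparison, the sketch below shows what a hand-written fragment of such a wrapper might look like today. It forwards only a handful of members and adds bounds checking to `operator[]`. The selection of members and the error message are assumptions made for this illustration. A true drop-in replacement would have to repeat this for every constructor, overload, and nested type of `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hand-written partial proxy for std::vector; illustrative only.
template <class T, class A = std::allocator<T>>
class instrumented_vector {
    std::vector<T, A> impl_;

public:
    using size_type = typename std::vector<T, A>::size_type;
    using reference = typename std::vector<T, A>::reference;
    using const_reference = typename std::vector<T, A>::const_reference;

    // Plain forward-to-member functions: pure boilerplate.
    size_type size() const noexcept { return impl_.size(); }
    bool empty() const noexcept { return impl_.empty(); }
    void push_back(const T& value) { impl_.push_back(value); }
    void push_back(T&& value) { impl_.push_back(std::move(value)); }

    // Instrumented members: same signature, with a bounds check added.
    reference operator[](size_type i) {
        check(i);
        return impl_[i];
    }
    const_reference operator[](size_type i) const {
        check(i);
        return impl_[i];
    }

private:
    void check(size_type i) const {
        if (i >= impl_.size())
            throw std::out_of_range("instrumented_vector: index out of range");
    }
};

// Small usage example.
inline void instrumented_vector_usage() {
    instrumented_vector<int> v;
    v.push_back(10);
    assert(v[0] == 10);
}
```

Even this fragment duplicates six signatures from `std::vector`; only the private `check` helper carries logic specific to the instrumentation.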

The ability to create proxy classes would also put to rest the unpleasantness of following the adage "prefer composition over inheritance" in C++. As mentioned, in many composition situations, numerous forward-to-member functions must be written and maintained; automating these stubs would make it much easier to follow the guideline therefore improving code quality without adding to its bulk. Making valuable programming idioms more accessible has repeatedly proven to be a wise investment.
The ability to create proxy classes would also put to rest the unpleasantness of following the adage "prefer composition over inheritance" in C++. As mentioned, in many composition situations, numerous forward-to-member functions must be written and maintained; automating these stubs would make it much easier to follow the guideline, therefore improving code quality without adding to its bulk. Making valuable programming idioms more accessible has repeatedly proven to be a wise investment.

An important part of defining instrumented classes is querying all members of an existing class (static and nonstatic data members, regular and special member functions, `enum` declarations, friend declarations...) and generating similar definitions within the context of a new class definition. The `define_class` primitive in P2996R1 is the fundamental mechanism for implementing a proxy class, and although it currently does not support adding member functions, it alludes to such a possibility in section 4.4.12: "For now, only non-static data member reflections are supported (via `nsdm_description`) but the API takes in a range of `info` anticipating expanding this in the near future." We believe the ability to define full-fledged classes is a quintessential, defining feature of a reflective metaprogramming facility for C++. Here are a few key components needed:

- Signatures of all functions must be accessible for introspection, and primitives for accessing full information of a function’s signature must be defined.
- Signatures of all functions must be accessible for introspection, and primitives for accessing the full information of a function’s signature must be defined.
- Synthesis of function signatures must be possible, e.g. a library may need to build a signature from scratch, or from a similar signature (e.g., create a new signature from a given signature by adding or removing an attribute).
- There must be an ability to attach code to the reflection of a function signature; for example, a library may want to define a proxy class that inserts logging for each function’s arguments and result. The most fit candidate for attaching such functionality to reflection is a generic function literal that is a friend of the generated class.
- Finally, `define_class` would accept synthesized member functions in addition to (and in a manner similar to) `nsdm_description`. Implementing member function synthesis should be feasible following a design similar to `nsdm_description`&mdash;a function `memfun_description` would take an `std::meta::info` (either synthesized or coming from the introspection of another member function) and a `memfun_options` object that has, among other members, a lambda function to serve as the body of the budding member function. The result of the call to `memfun_description` would be passed to `define_class`.

One aspect of function synthesis is *code cloning*&mdash;the ability to compile a reflected function template under different constraints (e.g. different concepts and attributes). As an example, this aspect is important to CUDA C++ libraries that need to add `__device__` attributes to all methods of replicas of standard types—such as `std::pair`, `std::tuple`, and `std::optional`&mdash;and subsequently compile the resulting code for use on the device. The alternative&mdash;copying and pasting code with minute changes&mdash;is a proverbially bad practice.
One aspect of function synthesis is *code cloning*&mdash;the ability to compile a reflected function template under different constraints (e.g. different concepts and attributes). As an example, this aspect is important to CUDA C++ libraries that need to add `__device__` attributes to all functions of replicas of standard types—such as `std::pair`, `std::tuple`, and `std::optional`&mdash;and subsequently compile the annotated functions into device code for use on GPUs. The alternative&mdash;copying and pasting code with minute changes&mdash;is a proverbially bad practice.
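As a point of reference, the sketch below shows the kind of manual replication this paragraph describes, using a conditional annotation macro. This is a common workaround in today's code and is not the mechanism this paper proposes. The macro name `REPLICA_HOST_DEVICE` and the namespace `replica` are hypothetical. The macro expands to `__host__ __device__` only under a CUDA compiler (detected via `__CUDACC__`), so the same source also builds as ordinary C++.

```cpp
#include <cassert>

// Hypothetical annotation macro: CUDA execution-space specifiers when
// compiled by a CUDA compiler, nothing otherwise.
#if defined(__CUDACC__)
#  define REPLICA_HOST_DEVICE __host__ __device__
#else
#  define REPLICA_HOST_DEVICE
#endif

namespace replica {

// Hand-copied, heavily simplified stand-in for std::pair. Every function
// has to be re-declared just to attach the annotation.
template <class T1, class T2>
struct pair {
    T1 first;
    T2 second;

    REPLICA_HOST_DEVICE constexpr pair(const T1& a, const T2& b)
        : first(a), second(b) {}
};

template <class T1, class T2>
REPLICA_HOST_DEVICE constexpr bool operator==(const pair<T1, T2>& a,
                                              const pair<T1, T2>& b) {
    return a.first == b.first && a.second == b.second;
}

template <class T1, class T2>
REPLICA_HOST_DEVICE constexpr pair<T1, T2> make_pair(T1 a, T2 b) {
    return pair<T1, T2>(a, b);
}

}  // namespace replica

// Small usage example (host side).
inline void replica_pair_usage() {
    constexpr auto p = replica::make_pair(1, 2);
    static_assert(p.first == 1, "constexpr construction works on the host");
    assert(p == replica::make_pair(1, 2));
}
```

The replica repeats the declarations of the type it mirrors with a single token of difference per function, which is the duplication that code cloning is meant to eliminate.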

# Embedded Domain Specific Languages

Python’s many frameworks (e.g., PyTorch and TensorFlow, among many others) have demonstrated the importance of Embedded Domain Specific Languages (EDSLs) in today’s world, especially for AI applications. A reflective metaprogramming facility for C++ is expected to make it possible to express EDSLs much better than C++ currently allows.

EDSL-related features would place emphasis on the *generative* aspect of reflective metaprogramming. For example, an aspirational EDSL way of doing things for GPU-accelerated code could take in an algorithm written concisely in a high-level array language and generate during compilation specialized CUDA C++ code implementing the algorithm with the same efficiency as if this low-level code were written by hand.
EDSL-related features would place emphasis on the *generative* aspect of reflective metaprogramming. For example, an aspirational EDSL way of doing things for GPU-accelerated code could take in an algorithm written concisely in a high-level array language and generate during compilation specialized CUDA C++ code implementing the algorithm with the same efficiency as if this low-level code was written by hand.

The utility of EDSLs is, of course, much broader. An EDSL essentially allows the programmer to author a high-level expressive specification adapted to the problem domain (function differentiation, relational databases, networking protocols, regular expressions, EBNF grammars, document formatting...), to then generate C++ code from it. The approach is advantageous if writing the same C++ code by hand would be a much more costly proposition. Domain-specific libraries can provide the desired level of abstraction, but struggle to optimize execution in ways that cross abstraction boundaries. EDSL-style approaches can provide this missing capability.
