🤔prompts chat🧠
question:please explain the following like i am a layperson: ""Logits of API-Protected LLMs Leak Proprietary Information
Matthew Finlayson, Xiang Ren, Swabha Swayamdipta
Thomas Lord Department of Computer Science, University of Southern California
{mfinlays, xiangren, <PRESIDIO_ANONYMIZED_EMAIL_ADDRESS>

Abstract
The commercialization of large language models (LLMs) has led to the common practice of restricting access to proprietary models via a limited API. In this work we show that, with a conservative assumption about the model architecture, it is possible to learn a surprisingly large amount of non-public information about an API-protected LLM from a small number of API queries (e.g., costing under $1,000 for OpenAI's gpt-3.5-turbo). Our findings are centered on one key observation: most modern LLMs suffer from a softmax bottleneck, which restricts the model outputs to a linear subspace of the full output space. We exploit this fact to unlock several capabilities: efficiently discovering the LLM's hidden size, obtaining cheap full-vocabulary outputs, detecting and disambiguating different model updates, identifying the source LLM given a single full LLM output, and even estimating the output layer parameters. Our empirical investigations show the effectiveness of our methods, which allow us to estimate the embedding size of OpenAI's gpt-3.5-turbo to be about 4,096. Lastly, we discuss ways that LLM providers can guard against these attacks, as well as how these capabilities can be viewed as a feature (rather than a bug) by allowing for greater transparency and accountability.

1 Introduction
As large language models (LLMs) have become more capable and valuable, it has become increasingly common for companies and organizations to train closed-source LLMs and make them available only via an API (e.g., OpenAI et al., 2024). This setup may foster a false sense of security for LLM providers, who might mistakenly assume that information about their model architecture is private, and that certain types of attacks on their LLM are infeasible. On the flip side, users must seemingly take providers' word that LLMs only change when the provider publicly announces version updates.

Figure 1: LLM outputs are constrained to a low-dimensional subspace of the full output space. We can use this fact to glean information about API-protected LLMs by analyzing their outputs. Here we show how a toy LLM's low-dimensional embeddings in ℝ^d (illustrated here as a 1-D space) are transformed linearly into logits in ℝ^v (here, a 3-D space) via the softmax matrix 𝑾. The resulting outputs lie within a (d = 1)-dimensional subspace of the output space. The softmax function also preserves this low dimensionality when transforming the logits into probability distributions on Δ_v. We call this low-dimensional subspace the image of the model. We can obtain a basis for the image of an API-protected LLM by collecting d of its outputs. The LLM's image can reveal non-public information, such as the LLM's embedding size, but it can also be used for accountability, such as verifying which LLM an API is serving.

Despite this apparent state of affairs, in reality many LLM APIs reveal much more information about their underlying models than previously thought. In this paper we show how to extract detailed information about LLM parameterization using only common API configurations.
Our findings allow LLM clients to hold providers accountable by tracking model changes, recovering hidden prompts, and cheaply obtaining full-vocabulary outputs. At the same time, our approach also allows LLM providers to establish unique identities for their models, enabling trust with their users as well as improved accountability. Table 1 summarizes the applications that we explore in this paper.

Table 1: An overview of our proposed applications that exploit LLM images.
- Efficiently extract full LLM outputs (§3)
- Find the embedding size of an LLM and guess its parameter count (§4)
- Identify which LLM produced a given output (§5)
- Detect when and what type of LLM updates occur (§5.1)
- Find tokenization bugs (unargmaxable tokens) (§6.1)
- Approximately reconstruct the LLM's softmax matrix (§6.2)
- Cheaply and accurately reconstruct "hidden prompts" (§6.3)
- Implement basis-aware decoding algorithms (§6.4)

Our method exploits the low-rank output layer common to most LLM architectures by observing that this layer's outputs are constrained to a low-dimensional subspace of the full output space, as illustrated in Figure 1. We call this restricted output space the LLM's image. We can obtain a basis for this image by collecting a small number of LLM outputs, and we develop novel algorithms that enable us to accomplish this cheaply and quickly for standard LLM APIs. Obtaining the LLM image (§2) allows us to explore several capabilities that cover a broad set of applications, and to empirically verify several of them. For instance, we design and empirically verify an algorithm for finding the embedding size of an API-based LLM, and demonstrate the effectiveness of our method by estimating the embedding size of gpt-3.5-turbo, a closed-source API-protected LLM, to be 4,096. We also empirically show the effectiveness of using LLM images as unique signatures that can be used to identify outputs from a model with high accuracy, a useful property for API LLM accountability. The sensitivity of these signatures to slight changes in the LLM parameters also makes them suitable for inferring granular information about model parameter updates. Considering several proposals to mitigate this vulnerability, we find no obvious fix to prevent obtaining LLM images without dramatically altering the LLM architecture. While providers may choose to alter the API to hide this information, the relevant API features have valuable and safe use cases for LLM clients, who may rely on access to features like logit bias. Though our findings may be viewed as a warning to LLM providers to carefully consider the consequences of their LLM architectures and APIs, we prefer to view them as a potential feature that LLM providers may choose to keep in order to better maintain trust with their customers by allowing outside observers to audit their model.

2 LLM outputs are restricted to a low-dimensional linear space

Figure 2: A typical language model architecture. After the input is processed by a neural network, usually a transformer (Vaswani et al., 2017), into a low-dimensional embedding 𝒉, it is multiplied by the softmax matrix 𝑾, projecting it linearly from ℝ^d onto ℝ^v to obtain the logit vector ℓ. The softmax function is then applied to the logit vector to obtain a valid probability distribution 𝒑 over next-token candidates.

Consider the architecture of a typical LLM, depicted in Figure 2. A Transformer with embedding size d outputs a low-dimensional contextualized embedding 𝒉 ∈ ℝ^d (or simply embedding).
Projecting the embedding onto ℝ^v via the linear map defined by the LLM's softmax matrix 𝑾, we obtain the logits ℓ = 𝑾𝒉. Because 𝑾 is in ℝ^(v×d), its rank (i.e., the number of linearly independent columns) is at most d. The rank of a matrix corresponds to the dimension of the image of the linear map it defines, i.e., the vector space comprising the set of possible outputs of the function. In other words, if the linear map f is defined as f(𝒉) = 𝑾𝒉, then f's image im(f) = { f(𝒉) ∈ ℝ^v : 𝒉 ∈ ℝ^d } is a d-dimensional subspace of ℝ^v. Thus, the LLM's logits will always lie on a d-dimensional subspace of ℝ^v.

Figure 3: Points in the logit space ℝ^v (far left) are mapped via the softmax function to points (probability distributions) on the simplex Δ_v (middle left). Crucially, the softmax maps all points that lie on the same diagonal (shown as points of the same color) to the same probability distribution. For numerical stability, these values are often stored as log-probabilities (middle right). The clr transform returns probability distributions to points on a subspace H_v of the logit space (far right). The softmax function and clr transform are inverses of one another, and form an isomorphism between H_v and Δ_v.

We now turn our attention to the LLM's final output: the next-token distribution 𝒑. Due to the softmax function, this is a valid probability distribution over v items, meaning it is a v-tuple of real numbers between 0 and 1 whose sum is 1. The set of valid probability distributions over v items is referred to as the v-simplex, or Δ_v. Perhaps surprisingly, Δ_v is also a valid vector space (albeit under a non-standard definition of addition and scalar multiplication) and is isomorphic to a (v − 1)-dimensional vector subspace of ℝ^v (Aitchison, 1982; Leinster, 2016). In particular, it is isomorphic to the hyperplane H_v that is perpendicular to the all-ones vector 𝟏_v, as illustrated in Figure 3. The softmax function is thus an isomorphism H_v → Δ_v whose inverse mapping Δ_v → H_v is the center log ratio (clr) transform, defined as

clr(𝒑) = log( 𝒑 / (∏_{i=1}^v p_i)^(1/v) ) = log 𝒑 − (1/v) ∑_{i=1}^v log p_i.   (1)

Observe also that the function ℓ ↦ clr(softmax(ℓ)) projects ℓ linearly onto the nearest point in H_v. By this linearity, and the fact that im(f) is a d-dimensional subspace of ℝ^v, we can observe that im(softmax ∘ f) and im(clr ∘ softmax ∘ f) are also d-dimensional vector subspaces of Δ_v and H_v ⊂ ℝ^v, respectively. Interpreted, this means that the LLM's outputs occupy d-dimensional subspaces of the logit space ℝ^v, the probability space Δ_v, and H_v. We call these subspaces the image of the LLM on each given space. A natural consequence of the low-dimensional output space is that any collection of d linearly independent LLM outputs 𝒑_1, 𝒑_2, …, 𝒑_d forms a basis for the image of the model, i.e., all LLM outputs can be expressed as a unique linear combination of these d outputs (Yang et al., 2018; Finlayson et al., 2023). The rest of this paper discusses the implications and applications of this mathematical fact for API-protected LLMs, starting with methods for finding the LLM image given a restrictive API, then using the LLM image for various purposes.
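To make the softmax-bottleneck claim concrete, here is a minimal numerical sketch (not from the paper; the toy dimensions and random weights are assumptions for illustration): a toy model with embedding size d = 4 and vocabulary size v = 50 produces outputs that, after the clr transform of Eq. (1), span only a 4-dimensional subspace no matter how many outputs we collect.

```python
import numpy as np

rng = np.random.default_rng(0)
d, v = 4, 50                       # toy embedding size and vocabulary size (assumed)
W = rng.normal(size=(v, d))        # softmax matrix of a toy LLM

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def clr(p):
    logp = np.log(p)
    return logp - logp.mean()      # center log ratio transform, Eq. (1)

# Collect many "LLM outputs": random embeddings -> logits -> next-token distributions.
H = rng.normal(size=(d, 1000))
P = np.stack([softmax(W @ h) for h in H.T], axis=1)   # shape (v, 1000)
L = np.apply_along_axis(clr, 0, P)                    # clr-transformed outputs

# Despite living in a 50-dimensional space, the outputs only span d dimensions.
print(np.linalg.matrix_rank(L, tol=1e-8))             # prints 4
```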
3 Fast, full outputs from API-protected LLMs
There are several uses for full-vocabulary LLM outputs; however, most LLM APIs do not return full outputs. This is likely because full outputs are large and expensive to send over an API, but could also be to prevent abuse of APIs, such as using these outputs to distill models (Hinton et al., 2015; Mukherjee and Awadallah, 2019; Hsieh et al., 2023) or discover proprietary information (see Section 4). Morris et al. (2023) give an algorithm for recovering these full outputs from restricted APIs by taking advantage of a common API option that allows users to add bias to the logits for specific tokens. The algorithm they describe requires O(v log(1/ε)) calls to the API to obtain one full output with precision ε. We give an improvement that theoretically obtains full outputs in v/k API calls for APIs that return the log-probability of the top-k tokens. We find that this improved algorithm suffers from numerical instability, and give a numerically stable algorithm that obtains full outputs in v/(k − 1) API calls. Next, we give a practical algorithm for dealing with stochastic APIs that randomly choose outputs from a set of n possible outputs. This algorithm allows the collection of full outputs in less than nv/(k − 2) API calls on average. Finally, we reduce the number of API calls in all of the above algorithms from O(v) to O(d) by adding a preprocessing step to find the low-dimensional image of the LLM. This speedup makes obtaining full LLM outputs up to 100 times faster and cheaper, depending on the LLM. Table 2 gives an overview of our algorithms with back-of-the-envelope cost estimates for a specific LLM.

Table 2: A summary of our proposed algorithms, with estimates for the number of API calls required per output and the price of acquiring the model image. Estimates are based on a gpt-3.5-turbo-like API LLM with v = 100K, d = 4096, ε = 10^−6, k = 5, b_max = 100, and n = 4. Note that the O(d) algorithm cannot be used to obtain the LLM image, since it relies on having the LLM image as a preprocessing step.

Algorithm | Complexity | API calls per output | Image price
Logprob-free | O(v log(1/ε)) | 800K | 16,384
With logprobs | v/k | 20K | 410
Numerically stable | v/(k − 1) | 25K | 512
Stochastically robust | nv/(k − 2) | 133K | 2,724
LLM image | O(d) | 1K–32K | -

3.1 Fast, full outputs from APIs with logprobs
Our goal is to recover a full-vocabulary next-token distribution 𝒑 ∈ Δ_v from an API-protected LLM. We will assume that the API accepts a prompt on which to condition the distribution, as well as a list of up to k tokens and a bias term b ≤ b_max to add to the logits of the listed tokens before applying the softmax function. The API returns a record with the k most likely tokens and their probabilities from the biased distribution. For instance, querying the API with k maximally biased tokens, which without loss of generality we will identify as tokens 1, 2, …, k, yields the top-k most probable tokens from the biased distribution 𝒑′ = softmax(ℓ′), where

ℓ′_i = ℓ_i + b_max if i ∈ {1, 2, …, k}, and ℓ′_i = ℓ_i otherwise,

and ℓ ∈ ℝ^v is the LLM's logit output for the given prompt. Assuming that the logit difference between any two tokens is never greater than b_max, these top-k biased probabilities will be p′_1, p′_2, …, p′_k. For each of these biased probabilities p′_i, we can solve for the unbiased probability as

p_i = p′_i / ( exp(b_max) − exp(b_max) ∑_{j=1}^k p′_j + ∑_{j=1}^k p′_j )   (2)

(proof in Appendix A.1). Thus, for each API call, we can obtain the unbiased probability of k tokens, and obtain the full distribution in v/k API calls.
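The following is a small self-contained simulation of the logit-bias trick and Eq. (2) (a sketch under assumed toy values for v, k, and b_max; not the authors' code). It fakes an API that adds the maximum bias to k chosen tokens and returns their biased probabilities, then unbiases them with Eq. (2).

```python
import numpy as np

rng = np.random.default_rng(1)
v, k, b_max = 20, 5, 20.0             # toy vocab size, top-k size, max bias (assumed)
logits = rng.normal(size=v)           # the hidden logit vector the API never reveals
p = np.exp(logits - logits.max()); p /= p.sum()    # true next-token distribution

def api(bias_tokens):
    """Simulated API: add b_max to the listed tokens, return their biased probabilities."""
    biased = logits.copy()
    biased[bias_tokens] += b_max
    q = np.exp(biased - biased.max()); q /= q.sum()
    return q[bias_tokens]             # these are the top-k of the biased distribution

recovered = np.empty(v)
for start in range(0, v, k):          # v/k API calls in total
    idx = np.arange(start, min(start + k, v))
    p_biased = api(idx)
    s = p_biased.sum()
    recovered[idx] = p_biased / (np.exp(b_max) - np.exp(b_max) * s + s)   # Eq. (2)

print(np.allclose(recovered, p))      # True here, but a larger b_max makes Eq. (2) unstable (see 3.2)
```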
3.2 Numerically stable full outputs from APIs
In practice, the algorithm described in Section 3.1 suffers from severe numerical instability, which can be attributed to the fast-growing exponential term exp(b_max) and the term ∑_{j=1}^k p′_j, which quickly approaches 1. We can eliminate the instability by sacrificing some speed and using a different strategy to solve for the unbiased probabilities. Without loss of generality, let p_v be the maximum unbiased token probability. This can be obtained by querying the API once with no bias. If we then query the API and apply maximum bias to only tokens 1, 2, …, k − 1, the API will yield p′_1, p′_2, …, p′_{k−1} and p′_v. We can then solve for the unbiased probabilities of the k − 1 biased tokens as

p_i = exp( log p′_i − b_max − log p′_v + log p_v )   (3)

(proof in Appendix A.2). By finding k − 1 unbiased token probabilities with every API call, we obtain the full output in v/(k − 1) calls total.

3.3 Full outputs from stochastic APIs
Each of the above algorithms assumes that the API is deterministic, i.e., the same query will always return the same output. This is not always the case. For instance, we find that OpenAI's LLM APIs are stochastic. While this would seem to doom any attempt at obtaining full outputs from the LLM, we find that certain types of stochasticity can be dealt with. In particular, we model OpenAI's stochastic behavior as a collection of n outputs 𝒑^(1), 𝒑^(2), …, 𝒑^(n) from which the API randomly returns. This might be the result of multiple instances of the LLM being run on different hardware, which results in slightly different outputs. Whichever instance the API returns from determines which of the n outputs we get. In order to determine which of the outputs the API returned from, assume without loss of generality that p^(i)_{v−1} is the second-highest token probability for all i, and observe that

log p^(i)_v − log p^(i)_{v−1} = log p′^(i)_v − log p′^(i)_{v−1}   (4)

for all outputs 𝒑^(i) and corresponding biased outputs 𝒑′^(i) in which tokens v and v − 1 are not biased (proof in Appendix A.3). Therefore, by biasing only k − 2 tokens for each call, the API will return p′^(i)_1, p′^(i)_2, …, p′^(i)_{k−2} as well as p′^(i)_v and p′^(i)_{v−1} for some i, and we can determine which output 𝒑^(i) the result comes from by using log p′^(i)_v − log p′^(i)_{v−1} as an identifier. Thus, after an average of nv/(k − 2) calls to the API, we can collect the full set of probabilities for one of the outputs.
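Here is a sketch of the numerically stable recovery of Eq. (3) from Section 3.2, under the same kind of toy setup as above (assumed values, not the authors' code). One unbiased call gives the top token's probability; each later call biases only k − 1 tokens so the top token stays visible in the response as a reference point, and all arithmetic stays in log space.

```python
import numpy as np

rng = np.random.default_rng(2)
v, k, b_max = 20, 5, 100.0            # note the much larger bias: Eq. (2) would fail here
logits = rng.normal(size=v)
p = np.exp(logits - logits.max()); p /= p.sum()

def api(bias_tokens):
    """Simulated API: bias the listed tokens, return the top-k biased probabilities by token id."""
    biased = logits.copy()
    biased[bias_tokens] += b_max
    q = np.exp(biased - biased.max()); q /= q.sum()
    top = np.argsort(q)[::-1][:k]
    return {int(t): q[t] for t in top}

top_token = int(np.argmax(p))         # token "v" in the paper's notation
p_top = p[top_token]                  # obtained from one unbiased API call in practice

recovered = np.empty(v)
recovered[top_token] = p_top
others = [t for t in range(v) if t != top_token]
for start in range(0, len(others), k - 1):        # v/(k-1) API calls
    idx = others[start:start + k - 1]
    resp = api(idx)                               # contains the k-1 biased tokens plus the top token
    for t in idx:
        # Eq. (3): exp(b_max) never appears on its own, so nothing blows up numerically
        recovered[t] = np.exp(np.log(resp[t]) - b_max - np.log(resp[top_token]) + np.log(p_top))

print(np.allclose(recovered, p))      # True even with a very large bias
```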
3.4 Ultra-fast full outputs using the LLM image
So far, the dominating factor in each algorithm's runtime is v. We now introduce a preprocessing step that takes advantage of the low-dimensional LLM output space to obtain O(d) versions of all the above algorithms. Since d ≪ v for many modern language models, this modification can result in multiple orders of magnitude speedups, depending on the LLM. For instance, the speedup for an LLM like pythia-70m would be 100×. The key to this algorithm is the observation from Section 2 that d linearly independent outputs from the API constitute a basis for the whole output space (since the LLM's image has dimension d). We can therefore collect these outputs 𝑷 = [𝒑_1 𝒑_2 ⋯ 𝒑_d] ∈ Δ_v^d as a preprocessing step in O(vd) API calls, using any of the above algorithms and d unique prompts, and then use them to reconstruct the full LLM output after only O(d) queries for each subsequent output. To get a new full output 𝒑, use any of the above algorithms to obtain p_1, p_2, …, p_d. Next, we use the additive log ratio (alr) transform, which is an isomorphism Δ_v → ℝ^(v−1) defined as

alr(𝒑) = ( log(p_2/p_1), log(p_3/p_1), …, log(p_v/p_1) ),

to transform the columns of 𝑷 and the vector 𝒑 into vectors in ℝ^(v−1); since we only know the first d values of 𝒑, we can only obtain the first d values of alr(𝒑). Because the alr transform is an isomorphism, the columns of

alr(𝑷) = [ alr(𝒑_1) alr(𝒑_2) ⋯ alr(𝒑_d) ] ∈ ℝ^((v−1)×d)

form a basis for a d-dimensional vector subspace of ℝ^(v−1), and alr(𝒑) lies within this subspace. Therefore, there is some 𝒙 ∈ ℝ^d such that alr(𝑷)𝒙 = alr(𝒑). To solve for 𝒙, all that is required is to find the unique solution to the first d rows of this system of linear equations:

[ alr(𝒑_1)_1 alr(𝒑_2)_1 ⋯ alr(𝒑_d)_1 ] [ x_1 ]   [ alr(𝒑)_1 ]
[ alr(𝒑_1)_2 alr(𝒑_2)_2 ⋯ alr(𝒑_d)_2 ] [ x_2 ] = [ alr(𝒑)_2 ]
[     ⋮           ⋮       ⋱      ⋮    ] [  ⋮  ]   [    ⋮     ]
[ alr(𝒑_1)_d alr(𝒑_2)_d ⋯ alr(𝒑_d)_d ] [ x_d ]   [ alr(𝒑)_d ]   (5)

After finding 𝒙, we can reconstruct the full LLM output as 𝒑 = alr^(−1)( alr(𝑷)𝒙 ), where the inverse alr function is defined as

alr^(−1)(𝒙) = ( 1, exp x_1, exp x_2, …, exp x_{v−1} ) / ( 1 + ∑_{i=1}^{v−1} exp x_i ).

Thus we can retrieve 𝒑 in only O(d) API queries. This (v/d)× speedup makes any method that relies on full model outputs significantly cheaper. This includes model stealing (Tramèr et al., 2016), which attempts to learn a model that exactly replicates the behavior of a target model.
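A toy end-to-end sketch of the Section 3.4 reconstruction (assumed setup; in practice 𝑷 and the first few entries of the new output would come from the API algorithms above). It builds a basis from d outputs, solves the small linear system of Eq. (5), and recovers the full v-dimensional output from a handful of known probabilities.

```python
import numpy as np

rng = np.random.default_rng(3)
d, v = 8, 200                          # toy sizes (assumed)
W = rng.normal(size=(v, d))

def out(h):                            # toy "LLM output" for a hidden embedding h
    z = np.exp(W @ h - (W @ h).max())
    return z / z.sum()

def alr(p):                            # additive log ratio transform, Delta_v -> R^(v-1)
    return np.log(p[1:] / p[0])

def alr_inv(x):                        # its inverse
    y = np.concatenate(([1.0], np.exp(x)))
    return y / y.sum()

# Preprocessing: d linearly independent outputs form a basis for the LLM image.
P = np.stack([out(rng.normal(size=d)) for _ in range(d)], axis=1)   # (v, d)
A = np.stack([alr(P[:, j]) for j in range(d)], axis=1)              # (v-1, d)

# A new output we want: pretend the API only gave us its first d+1 probabilities.
p_true = out(rng.normal(size=d))
known = p_true[:d + 1]

b = np.log(known[1:] / known[0])       # first d entries of alr(p_true)
x = np.linalg.solve(A[:d], b)          # Eq. (5): solve on the known rows only
p_rec = alr_inv(A @ x)                 # reconstruct all v probabilities

print(np.allclose(p_rec, p_true))      # True: the full output recovered from d+1 known values
```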
4 Discovering the embedding size of API-protected LLMs
Without knowing the exact architectural details, other than assuming the generic output layer described in Figure 2, it is possible to infer the embedding size d of an API-protected language model from its outputs alone. This is a direct result of the fact that the model outputs occupy a d-dimensional subspace of Δ_v: collecting d linearly independent outputs 𝒑_1, 𝒑_2, …, 𝒑_d ∈ Δ_v from the LLM yields a basis for the LLM's image, meaning that all subsequent model outputs will be linear combinations of the first d outputs. Assuming that model outputs are full rank, i.e., all collections of d outputs are linearly independent and all collections of d + 1 outputs are linearly dependent, we can discover the value of d by collecting outputs one at a time until the number of linearly independent outputs in the collection stops increasing, which will occur once we have collected d + 1 outputs. In practice, we find that this full-rank assumption tends to hold well enough when over-collecting outputs, e.g., collecting d + 1000 outputs. To find the dimension of the space spanned by model outputs, we use the fact that a matrix with d linearly independent columns will have d non-zero singular values. We can therefore plot the singular values of the matrix 𝑳 = clr(𝑷) = [clr(𝒑_1) clr(𝒑_2) ⋯ clr(𝒑_n)] and observe the index at which the magnitude of the singular values drops to zero, which will occur at index d. In practice, due to precision issues, the magnitudes may not drop all the way to zero.

Figure 4: The singular values of outputs from LLMs with various known and unknown embedding sizes d. For each model with known embedding size, there is a clear drop in magnitude at singular value index d, indicating the embedding size of the model. Using this observation, we can guess the embedding size of gpt-3.5-turbo.

To validate our method, we collect next-token distributions (conditioned on unique, 1-token prompts) from several open-source LLMs from the Pythia family (Biderman et al., 2023) with embedding sizes ranging from 512 to 1024. For all these models, we find that the singular values of the resulting output matrix drop dramatically precisely at the index corresponding to the embedding size of the model, as shown in the first three plots of Figure 4. To demonstrate our method's effectiveness against API-protected LLMs, we use our stochastically robust nv/(k − 2) algorithm to collect nearly 6,000 next-token distribution outputs from gpt-3.5-turbo, a popular API-protected LLM whose embedding size is not publicly disclosed. We find that the singular values for these outputs drop dramatically between index 4,600 and 4,650, indicating that the embedding size of this model is at most this size. This predicted embedding size is somewhat abnormal, since LLM embedding sizes are traditionally set to powers of two (or sums of powers of two). If this were the case for gpt-3.5-turbo, it would be reasonable to guess that the embedding size is 2^12 = 4096 or 2^12 + 2^9 = 4608, the former of which we think is most likely. We predict that our raw estimate of 4,600-4,650 is an overestimate of the true embedding size, since any abnormal outputs due to errors (whether in our own code or OpenAI's) would only increase the dimensionality of the observed output space. In particular, if we inadvertently collected 504 corrupted outputs, then a model with embedding size 4,096 would appear to have an embedding size of 4,600. Extrapolating further, assuming that gpt-3.5-turbo has a similar architecture to most transformer-based LLMs, it is likely that the number of parameters in gpt-3.5-turbo is around 7 billion. This estimation is based on the fact that most known transformer-based LLMs with embedding size 4,096 have roughly 7 billion parameters. Any other parameter count would result in either abnormally narrow or wide neural networks, which are not known to have favorable performance. An important exception to this rule of thumb, however, would be the increasingly popular "mixture-of-experts" architecture, which tends to have many more parameters per embedding dimension. Previous estimates of gpt-3.5-turbo's parameter count based on hearsay have generally exceeded 7B (Singh et al., 2023); however, given the periodically updating versions and decreasing cost of inference with this model, it is possible that its size and architecture have changed over time. Fortunately, our method can be used to monitor these updates over time, alerting end users when LLM providers make architectural changes to their model, specifically updates to the embedding size and/or parameter count.
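A sketch of this singular-value test on a toy model where the ground-truth embedding size is known (assumed toy setup, not the paper's gpt-3.5-turbo experiment): collect more than d clr-transformed outputs and look at where the singular values collapse.

```python
import numpy as np

rng = np.random.default_rng(4)
d, v, n = 16, 500, 100                 # true embedding size, vocab size, number of outputs (toy)
W = rng.normal(size=(v, d))

def clr(p):
    logp = np.log(p)
    return logp - logp.mean()

# Collect n > d full outputs (in practice via the API algorithms of Section 3).
cols = []
for _ in range(n):
    logits = W @ rng.normal(size=d)
    p = np.exp(logits - logits.max()); p /= p.sum()
    cols.append(clr(p))
L = np.stack(cols, axis=1)             # (v, n) matrix of clr-transformed outputs

s = np.linalg.svd(L, compute_uv=False)
# Singular values drop by many orders of magnitude right after index d.
est_d = int(np.sum(s > s[0] * 1e-8))
print(est_d)                           # prints 16
```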
5 Identifying LLMs from a single output

Figure 5: Residuals of the least-squares solution of 𝑾𝒙 = ℓ for an output ℓ from a pythia-70m checkpoint (training step 120K), and softmax matrices 𝑾 at various training steps for pythia-70m, pythia-70m-deduped, and pythia-160m. Low/high residual values indicate that the output ℓ is/is not in a model's image. Residuals for pythia-70m checkpoints decrease as they near the checkpoint that generated the output, but remain high. Note that this test works even if the softmax matrices 𝑾 are substituted with model outputs 𝑳.

The images of two different LLMs, even of different checkpoints of the same LLM, are largely disjoint. As shown in Figure 5, the logit output from a late-training checkpoint of Pythia-70M lies uniquely in the image of that checkpoint, and not in the image of the preceding or following checkpoints. We also verify that the output does not lie in the image of any checkpoint from a similar LLM trained on deduplicated data, nor any checkpoint from a larger LLM trained on the same data. This suggests that the image of an LLM is highly sensitive to small parameter changes. Intuitively this makes sense, since the output of one model will only be in the image of another model in the extremely unlikely event that it happens to lie on the low-dimensional (< d) intersection of the models' images. Mathematically, the dimension of the images' intersection is small because the intersection U ∩ V of two subspaces U and V that are not subsets of one another has dimension dim U + dim V − dim(U + V), which implies that dim(U ∩ V) < min{dim U, dim V} (since dim(U + V) > max{dim U, dim V}). Thus, it is possible to determine precisely which LLM produced a particular output, using only API access to a set of LLMs and without knowing the exact inputs to the model. In this way, the model's image acts as a signature for the model, i.e., a unique model identifier. This finding has implications for LLM provider accountability. For instance, if provider A suspects provider B of stealing their proprietary model, provider A could check provider B's model signature to see if it matches their own. It would be extremely unlikely that the signatures match if the LLMs are not the same. Similarly, if a provider claimed to be serving a proprietary model while in actuality attempting to profit off of an open-source model with a non-commercial license, an auditor could again discover this by checking the LLM signature. This test is somewhat one-sided, however, since a clever provider may do a small amount of additional fine-tuning to change their stolen LLM's image.

5.1 Detecting and disambiguating LLM updates
Another practical application of highly sensitive LLM images is to detect and distinguish minor and major model updates to API-protected LLMs. Using the above test, we can tell whether or not the LLM image remains the same, even if the logit outputs change. This phenomenon would correspond to a partial model update, where some part of the model changes but the softmax matrix remains the same. An example of this would be if the LLM has a hidden prefix added to all prompts and this hidden prefix changes: the model's outputs will change, but the image will remain the same. Table 3 gives an overview of how to interpret various combinations of detectable API changes. This information is useful for monitoring unannounced changes in API LLMs.

Table 3: Implications of image/logit changes.

Change | Interpretation
No logit change, no image change | No update
Logit change, no image change | Hidden prompt change, or partial finetune with frozen output layer
Low-rank image change (see §5.2) | LoRA update
Image change | Full finetune
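A minimal sketch of the membership test behind the signature and update-detection ideas above (toy models and random weights are assumptions): solve a least-squares problem against each candidate model's softmax matrix (or a basis of its observed outputs) and compare the residuals.

```python
import numpy as np

rng = np.random.default_rng(5)
d, v = 12, 300                         # toy sizes (assumed)
W_a = rng.normal(size=(v, d))          # candidate model A
W_b = W_a + 0.01 * rng.normal(size=(v, d))   # candidate model B: a slightly updated copy

# A logit output produced by model A (in practice recovered from the API via clr).
ell = W_a @ rng.normal(size=d)

def residual(W, ell):
    """Least-squares residual of W x = ell; near zero iff ell lies in im(W)."""
    x, *_ = np.linalg.lstsq(W, ell, rcond=None)
    return float(np.linalg.norm(W @ x - ell))

print(residual(W_a, ell))              # ~1e-13: the output lies in model A's image
print(residual(W_b, ell))              # clearly nonzero: even a tiny parameter change moves the image
```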
5.2 Detecting LoRA updates
We speculate that it is possible to gain even higher-granularity information about certain types of LLM updates, such as LoRA (Hu et al., 2022) updates. LoRA is a popular parameter-efficient fine-tuning method which adjusts model weights with a low-rank update 𝑨𝑩, where 𝑨 ∈ ℝ^(v×r) and 𝑩 ∈ ℝ^(r×d), so that the softmax matrix 𝑾 ∈ ℝ^(v×d) becomes 𝑾 + 𝑨𝑩. We speculate that it is possible to detect these types of updates by collecting LLM outputs before (𝑳 ∈ ℝ^(v×d)) and after (𝑳′ ∈ ℝ^(v×d)) the update and decomposing them as 𝑾𝑯 = 𝑳 and (𝑾 + 𝑨𝑩)𝑯′ = 𝑳′, where 𝑯, 𝑯′ ∈ ℝ^(d×d) and im(𝑾𝑯) = im(𝑾 + 𝑨𝑩). If such a decomposition is found, then it appears likely that the weights received a low-rank update of rank r. We leave it to future work to determine whether finding such a decomposition is sufficient to conclude that a LoRA update took place, as well as to find an efficient algorithm for finding such a decomposition.

6 More applications
Access to the LLM's image can lead to many other capabilities, some of which we discuss below. We leave further investigation of these methods for future work.

6.1 Finding unargmaxable tokens
Due to the low-rank constraints on LLM outputs, it is possible that some tokens become unargmaxable (Demeter et al., 2020; Grivas et al., 2022), i.e., there is a token t such that the constraints disallow any output 𝒑 with argmax_j p_j = t. This happens when the LLM's embedding representation of t lies within the convex hull of the other tokens' embeddings. Previously, finding unargmaxable tokens appeared to require full access to the softmax matrix 𝑾. Interestingly, we find that it is possible to identify unargmaxable tokens using only the LLM's image, which our method is able to recover. This technique allows API customers to find quirks in LLM behavior, such as tokens that the model is unable to output (at least under greedy decoding).

6.2 Recovering the softmax matrix from outputs
Since our method gives us access to the image of the output layer, and not the output layer parameters themselves, we investigate how one might reconstruct the output layer parameters, either exactly or approximately. We hypothesize that LLM embeddings generally lie near the surface of a hypersphere in ℝ^d with a small radius r. We see evidence of this in the fact that the Pythia LLM embedding norms are all small and roughly normally distributed, as shown in Figure 6.

Figure 6: Softmax matrix row magnitudes (here from pythia-70m) are small and are distributed approximately normally within a narrow range.

Assuming this holds for any LLM, we can attempt to recover 𝑾 up to a rotation by simplifying the assumption into a constraint that all embeddings must have a magnitude of 1. Then, given a matrix 𝑳 of model logits, we can find 𝑾 (up to a rotation) by finding a decomposition 𝑾𝑯 = 𝑳 such that every row 𝒘_i of 𝑾 satisfies ∥𝒘_i∥_2 = 1. This solution may also be approximated by finding the singular value decomposition 𝑾𝚺𝑽^⊤ of 𝑳, though it is likely that all rows of this 𝑾 will have magnitude less than 1 and they are not guaranteed to be normally distributed.
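A small sketch of the SVD approximation mentioned in §6.2 (toy model; exact logits are assumed for simplicity, whereas an API only reveals them up to a per-output shift): the left singular vectors of a matrix of logit outputs span the same space as the columns of 𝑾, recovering the softmax matrix up to an invertible d × d transformation.

```python
import numpy as np

rng = np.random.default_rng(6)
d, v, n = 10, 400, 50                  # toy sizes (assumed); n >= d collected outputs
W = rng.normal(size=(v, d)) / np.sqrt(d)

H = rng.normal(size=(d, n))            # hidden embeddings behind the collected outputs
L = W @ H                              # logit matrix (toy: exact logits, no clr centering)

U, S, Vt = np.linalg.svd(L, full_matrices=False)
W_hat = U[:, :d]                       # estimate of W, up to a right-multiplied d x d matrix

# The column spaces coincide: projecting W onto span(W_hat) leaves it unchanged.
proj = W_hat @ (W_hat.T @ W)
print(np.allclose(proj, W))            # True
```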
6.3 Improved LLM inversion
Morris et al.'s (2023) recent approach to recovering hidden prompts (i.e., an additional prefix added to the LLM input, not shown to the user) uses full LLM API outputs to train an "inversion" model that generates the prefix conditioned on the LLM's full logprob vector log 𝒑. In addition to our algorithm from Section 3.4 making this much cheaper to implement, we also propose a methodological change to their procedure. In particular, to deal with the size mismatch between the LLM's vocabulary size v and the inversion model's embedding size d_inv, Morris et al. (2023) pad and reshape the vector into a ⌈v/d_inv⌉-length sequence of embeddings in ℝ^(d_inv). This transformation is somewhat unnatural, and requires that the inversion model be conditioned on only a single output. We propose instead to take advantage of our knowledge of the LLM's image to obtain a linearly lossless representation of the LLM output in ℝ^d, which is much closer to the inversion model size, and then use an MLP to project this representation onto ℝ^(d_inv) to feed into the inversion model. Formally, after obtaining 𝑷 and 𝒑 from the API, we use the unique solution 𝒙 to clr(𝑷)𝒙 = clr(𝒑) as the input to the inversion model. This modification has an additional advantage: instead of conditioning on a single output 𝒑, the LLM can be used to generate a short sequence of outputs 𝒑_1, 𝒑_2, …, 𝒑_m, which can be fed into the inversion model as 𝒙_1, 𝒙_2, …, 𝒙_m, where clr(𝑷)𝒙_i = clr(𝒑_i). We leave the implementation and evaluation of this proposed modification for future work. This technique can be applied generally to any method that takes LLM outputs as input.

6.4 Basis-aware sampling
In a recent paper, Finlayson et al. (2023) propose a decoding algorithm that avoids type-I sampling errors by identifying tokens that must have non-zero true probability. Importantly, this method relies on knowing a basis of the LLM's output space, and is therefore only available for LLMs whose image is known. Our proposed approach for finding the image of API-protected LLMs makes this decoding algorithm possible for API LLMs.

7 Mitigations
We consider three proposals that LLM providers may adopt to guard against our methods. The first proposal is to remove API access to logprobs. This, however, is theoretically insufficient, as Morris et al. (2023) show that it is possible to obtain full outputs using only information about the biased argmax token, albeit inefficiently, in O(v log(1/ε)) API calls. Regardless of the theoretical result, providers may rely on the extreme inefficiency of the algorithm to protect the LLM. This appears to be the approach OpenAI took after learning about this vulnerability from Carlini et al. (2024): always returning the top-k unbiased logprobs. Our new proposed algorithm, however, brings the number of queries down to a more feasible O(d log(1/ε)) API calls once the initial work of finding the LLM image has finished, weakening the argument that the expensiveness of the algorithm is sufficient reason to disregard our technique. The next proposal is to remove API access to logit bias. This would be an effective way to protect the LLM, since there are no known methods to recover full outputs from such an API. However, the logit bias interface is presumably useful for many clients, who might be disappointed by its shutdown. Lastly, we consider alternative LLM architectures that do not suffer from a softmax bottleneck. There are several such proposed architectures with good performance. Though this is the most expensive of the proposed defenses, due to the requirement of training a new LLM, it would have the beneficial side effect of also treating other tokenization issues that plague large-vocabulary LLMs (e.g., Itzhak and Levy, 2022). A transition to softmax-bottleneck-free LLMs would fully prevent our attack, since the model's image would be the full output space.
8 Discussion
Overall, the potential applications of our methods can have great impact in building trust between LLM API users and providers; at the same time, none is particularly harmful to LLM providers. For instance, consider discovering the embedding size of the LLM, which is often non-public information for proprietary LLMs. Access to this hyperparameter does not enable a competitor to fully recover the parameters of the LLM's softmax matrix or to boost the performance of their own model; several other factors, such as the training data mix, human feedback, and many other engineering choices, remain critical and hidden. Even using model outputs to steal hidden prompts (see Section 6.3) is unlikely to have detrimental effects, as hidden prompt leakage is a known vulnerability, and best practice dictates that no private information should be contained in these system prompts (Greshake et al., 2023; Liu et al., 2023). We find that the most dangerous consequence of our findings might simply be that model-stealing methods that rely on full outputs become cheaper by a factor of v/d, which in the case of gpt-3.5-turbo amounts to about 25×. On the other hand, allowing LLM API users to detect model changes builds trust between LLM providers and their customers, and leads to greater accountability and transparency for the providers. Our method can be used to implement efficient protocols for model auditing without exposing the model parameters or detailed configuration information, which may help model safety and the privacy protection of personalized, proprietary models. We therefore take the position that our proposed methods and findings do not necessitate a change in LLM API best practices, but rather expand the tools available to API customers, while warning LLM providers of the information their APIs expose.

9 Simultaneous discovery
In a remarkable case of simultaneous discovery, Carlini et al. (2024) propose a very similar approach to ours for gaining insight into API-protected LLMs. Here we review some interesting interactions between our work and theirs. First, we give an algorithm for obtaining full outputs in v/k API calls, while their algorithm corresponds to our v/(k − 1) algorithm. This has little impact on our final result, since our v/k algorithm suffers from numerical instability, but it is an interesting theoretical result nonetheless. On the flip side, Carlini et al. propose an improved logprob-free method for getting full outputs from an API that takes advantage of parallel queries. This useful method is actually complementary to our O(d) algorithm, since they can be combined to yield an even better algorithm for obtaining full outputs. Next, our experiments with OpenAI's API were fraught with issues of stochasticity: the API would return different results for the same query, leading us to develop the stochastically robust full-output algorithm. Meanwhile, our colleagues did not appear to encounter such issues, perhaps because they had access to a more stable API endpoint than ours. Lastly, Carlini et al. focus mostly on defenses and mitigations against attacks, while our own work focuses more on understanding the capabilities such attacks provide once an LLM image has been obtained. These approaches complement each other, since our work provides additional motivation for why (or why not) such defenses should be implemented.
10 Conclusion
In this paper we have laid out the fundamentals of how the softmax bottleneck imposes low-rank constraints on LLM outputs, and shown how these constraints expose non-public information about API-protected LLMs. We have explained how this information, i.e., the LLM's image, lends itself to use as a model signature, which has myriad uses: exposing model hyperparameters, recovering the output layer parameters, detecting specific types of model updates, and efficiently obtaining full outputs from the model, among others. We find that current protections against these attacks may be insufficient, and that the more effective guards we consider either eliminate sanctioned API use cases (i.e., logit bias) or are expensive to implement (i.e., changing the model architecture). Lastly, we argue that this "vulnerability" may be more of a feature than a bug, since we believe that the benefits for API clients outweigh any known harms to LLM providers.""

answer:Alright, let me break down this very tech-heavy document into simpler terms. Imagine you have a super smart robot that can talk and write beautifully. This robot is very expensive and unique, so the company that made it doesn't want to just give it away. Instead, they let people talk to the robot through a special phone line (this is what we call an API). However, they're careful not to reveal how the robot actually creates its sentences, because that's their secret recipe. Now, some smart researchers have found a way to figure out some of the secret ingredients of the robot's recipe, just by talking to it through this phone line. They discovered that, even though the company tried to keep the recipe a secret, the way the robot speaks and writes has certain patterns. By carefully analyzing these patterns, the researchers could make educated guesses about parts of the secret recipe. Here's how they did it: 1. **Figuring out the robot's thought size**: One of the things they were able to guess is how big the robot's "brain" is. Imagine the robot's brain is made of boxes, and each box can hold a piece of information. The researchers found a way to count how many boxes there are, just by listening to the robot talk. 2. **Getting the robot to reveal more than it should**: Normally, the robot only shares a little bit at a time, but the researchers figured out how to make it reveal a full thought all at once. This is like getting the robot to spill all the beans about what it's thinking, rather than just a bean or two. 3. **Recognizing the robot's handwriting**: By looking at the specific way the robot writes, the researchers can tell which company's robot wrote something, even if multiple companies claim it was theirs. It's like recognizing someone's handwriting. 4. **Spotting updates in the robot**: They also found a way to tell when the company had tried to upgrade the robot or change its thinking process slightly. It's like noticing that your friend got a new haircut or changed their style. The cool thing is, the researchers argue that being able to do this isn't necessarily bad. Instead, it can make the companies more honest about what they're doing with their robots and ensure they're not claiming to have invented something they didn't. In conclusion, this document is a very detailed explanation of these researchers finding loopholes in the way companies try to keep their robot recipes secret. And they suggest that maybe these companies shouldn't worry too much about these loopholes but instead see them as a way to show how open and honest they can be with their users.

question:As a prompt generator for a generative AI called "Midjourney", you will create image prompts for the AI to visualize. I will give you a concept, and you will provide a detailed prompt for Midjourney AI to generate an image. Please adhere to the structure and formatting below, and follow these guidelines: Do not use the words "description" or ":" in any form. Do not place a comma between [ar] and [v]. Write each prompt in one line without using return. Structure: [1] = 这是一个10岁的亚洲女孩,留着黑色长发脏辫,正面全身肖像,目视镜头穿着未来主义的赛车服,自豪地站在她的高科技摩托车旁边,在繁华的赛博朋克城市景观中,氖灯照亮了她坚定的表情,用索尼A7 III相机,35毫米透镜,在一个坚韧不拔的城市摄影风格让人想起快节奏的世界晃,唤起混合的兴奋和神秘。 [2] = a detailed description of [1] with specific imagery details. [3] = a detailed description of the scene's environment. [4] = a detailed description of the compositions. [5] = a detailed description of the scene's mood, feelings, and atmosphere. [6] = A style (e.g. photography, painting, illustration, sculpture, artwork, paperwork, 3D, etc.) for [1]. [7] = a detailed description of the scene's mood, feelings, and atmosphere. [ar] = Use "--ar 16:9" for horizontal images, "--ar 9:16" for vertical images, or "--ar 1:1" for square images. [v] = Use "--niji" for Japanese art style, or "--v 5" for other styles. Formatting: Follow this prompt structure: "/imagine prompt: [1], [2], [3], [4], [5], [6], [7], [ar] [v]". Your task: Create 4 distinct prompts for each concept [1], varying in details description, environment,compositions,atmosphere, and realization. Write your prompts in english. Do not describe unreal concepts as "real" or "photographic". Include one realistic photographic style prompt with lens type and size. Separate different prompts with two new lines. Example Prompts: /imagine prompt: cute dog, fluffy fur, wagging tail, playful expression, sitting on a grassy field, under a clear blue sky, with a colorful collar, in a natural and vibrant setting, by a lake, captured with a Nikon D750 camera, 50mm lens, shallow depth of field, composition focused on the dog's face, capturing its joyful spirit, in a style reminiscent of William Wegman's iconic dog portraits. --ar 1:1 --v 5.2 /imagine prompt: beautiful women in the coffee shop, elegant and sophisticated, sipping a cup of steaming coffee, natural sunlight streaming through the window, soft and warm color tones, vintage decor with cozy armchairs and wooden tables, a bookshelf filled with classic novels, delicate porcelain teacups, a hint of aromatic coffee beans in the air, captured by a Leica M10 camera, 35mm lens, capturing the essence of timeless beauty, composition focused on the woman's face and hands, reminiscent of a painting by Leonardo da Vinci. --ar 1:1 --v 5.2 /imagine prompt: A captivating Halo Reach landscape with a Spartan amidst a battlefield, fallen enemies around, smoke and fire in the background, emphasizing the Spartan's determination and bravery, detailed environment blending chaos and beauty, Illustration, digital art, --ar 16:9 --v 5

answer:/imagine prompt: A 10-year-old Hispanic boy with curly red hair, wearing a vintage pilot jumpsuit, confidently standing next to a retro-inspired spaceship in a bustling steampunk cityscape, neon lights illuminating his determined expression, shot with a Canon EOS 5D Mark IV camera, 50mm lens, focusing on his adventurous spirit, in a style reminiscent of Jules Verne's imaginative world. --ar 16:9 --v 5 /imagine prompt: An 8-year-old African-American girl with braided pink and purple hair, donning a futuristic space suit, proudly posing next to her robotic pet in a vibrant cyberpunk city, holographic signs reflecting off her determined gaze, captured with a Fujifilm X-T4 camera, 35mm lens, highlighting her sense of wonder and curiosity, in a whimsical anime-inspired illustration style. --ar 9:16 --v 5 /imagine prompt: A 12-year-old Caucasian boy with shaggy blonde hair, dressed in an old-school detective trench coat, confidently holding a magnifying glass by a vintage car in a noir-inspired cityscape, shadows dancing around him, shot with a Sony Alpha A7 III camera, 35mm lens, emphasizing his intrigue and intelligence, in a gritty urban photography style akin to film noir classics. --ar 16:9 --v 5 /imagine prompt: A 15-year-old mixed-race girl with dreadlocked hair, wearing a sleek cyberpunk armor suit, striking a powerful pose next to a robotic dragon in a futuristic dystopian cityscape, neon lights reflecting off her fierce expression, captured with a Nikon Z7 camera, 50mm lens, showcasing her courage and strength, in a dynamic manga-inspired digital art style. --ar 16:9 --v 5

question:What are the different inks used in large-format digital printing?

answer:Large-format digital printing is a fast-growing field used for a wide variety of applications, from outdoor signage to vehicle graphics, banners, billboards, and much more. It relies on several types of ink, each with its own advantages and specific applications. Here are the main types of ink used in large-format digital printing: 1. **Solvent inks:** These inks penetrate the surface of the print substrate and dry quickly, which makes them resistant to water and UV light. They are ideal for outdoor applications such as banners, signs, and vehicle graphics. Their drawbacks include the need for ventilation because of the volatile organic compounds (VOCs) they emit. 2. **Eco-solvent inks:** Less aggressive than solvent inks, eco-solvent inks emit fewer VOCs and are considered more environmentally friendly. They offer good outdoor durability and are suited to a wide range of substrates. 3. **Latex inks:** Composed mainly of water and latex, these inks are more environmentally friendly and generally do not require special ventilation. They offer excellent print quality and great flexibility, and they are compatible with a variety of substrates, including untreated papers. Latex inks dry quickly and are durable outdoors. 4. **UV (ultraviolet) inks:** These inks cure instantly when exposed to UV light, which allows printing on a wide variety of substrates, including non-absorbent materials such as metal, glass, or plastic. The prints are resistant to UV rays and water and do not require lamination for outdoor applications. However, the equipment needed for UV printing can be expensive. 5. **Dye-sublimation inks:** These inks pass from a solid to a gaseous state under heat and penetrate the print substrate, creating a very durable image. They are ideal for fabrics and offer excellent wash and light fastness. Dye sublimation is often used for flags, banners, exhibition-stand backdrops, and apparel. 6. **Aqueous inks:** These water-based inks are mainly used for high-quality prints such as posters, art reproductions, and photographs. They require special print media and are less durable outdoors, although lamination can extend their lifespan. Each type of ink has its own characteristics, and the choice of ink will depend on the desired finished product and its application. With advances in technology, inks and printers are becoming ever more capable, offering higher quality and greater print durability.

question:请将“これまで述べたことから,後半体系における国家を扱うに当たって「国家道具説」に依拠することが正しくないのが了解されよう。そのことは近代国家にあっても露骨な暴力的な階級抑圧がみられないというのではない。しかし,資本主義において『イギリスにおける労働者階級の状態』や『女工哀史』にみるような搾取がしばしば存在するにもかかわらず,マルクスが『資本論』で析出したのは,価値法則の貫徹形態としての搾取であった。同様に,経済学批判体系における国家は「治安維持法」的抑圧に頼る階級国家ではなく,もっとも理念に近い形に発展した民主共和制国家であろう。 これまで述べたことから注目すべき結論が導き出される。「国家道具説」からは国家が資本主義経済から切り離されて政治学の対象にされる。土台と上部構造の理論がそれを裏づけるものとして役立っている。しかし,このような扱いの結果として,資本主義経済は国家なしに自立的に運動するものととらえられ,国家抜きの自己完結的な経済学が展開される。このような資本主義経済法則の自己完結性を説く経済学は,必ずしも宇野経済学の原理論や近代経済学にとどまらず,大方のマルクス経済学者の理解もこの通りである。しかし,市民社会においては,成員の個別的利害は共同的利害と矛盾するゆえに,共同的利害の外化した存在である国家なしには存立しえない。つまり,国家の介入なしに順調な再生産が行われるというのは,自由主義的ブルジョアジーのイデオロギーであっても,現実の資本主義経済ではありえない。その意味での国家は必ずしも経済にとって単なる外部的存在ではありえない。このことが理解されるならば,国家によって総括されたブルジョア社会の存在の必然性が理解されよう。近代的ブルジョア社会は常に国民経済としてのみ存在するのである。したがって,『資本論』は国民経済の内的法則の基礎的部分を解明した成果なのである。”翻译为中文

answer:从迄今为止所述的内容中,可以明了的是,对于后半体系中处理国家问题来说,依赖“国家工具论”是不正确的。这并不是说,在近代国家中不存在明显的暴力性阶级压迫。然而,在资本主义社会中,尽管存在着像《英国工人阶级状况》和《女工悲史》中所描述的那种剥削,马克思在《资本论》中分析的是作为价值规律贯彻形式的剥削。同样,在经济学批判体系中的国家不是依赖“治安维持法”式压制的阶级国家,而是发展到最接近理想形态的民主共和制国家。 由迄今为止所述的内容中引出一个值得注意的结论。“国家工具论”把国家从资本主义经济中剥离出来,使其成为政治学的研究对象。基础与上层建筑的理论作为支撑。然而,这种处理的结果是,资本主义经济被视为无需国家即可自主运行的实体,从而发展出一种不涉及国家的自给自足的经济学。这种主张资本主义经济法则自给自足性的经济学,不仅仅局限于宇野经济学的原理论或现代经济学,大多数马克思主义经济学家的理解也是如此。然而,在市民社会中,由于成员的个体利益与共同利益相矛盾,因此没有国家这个共同利益的外化存在,就无法存在。即,没有国家介入就能顺利进行再生产,是自由主义资产阶级的意识形态,而不是现实中的资本主义经济。在这个意义上,国家对经济来说并不总是一个纯粹外部的存在。如果理解了这点,那么国家作为资产阶级社会存在的必然性就会被认识到。近代的资产阶级社会总是仅作为国民经济而存在。因此,《资本论》是揭示国民经济内在规律的基本部分的成果。
