Author responses to reviews of "The Community Inversion Framework v1.0: a unified system for atmospheric inversion studies"

Atmospheric inversion approaches are expected to play a critical role in future observation-based monitoring systems for surface fluxes of greenhouse gases (GHGs), pollutants and other trace gases. In the past decade, the research community has developed various inversion software tools, mainly using variational or ensemble Bayesian optimization methods, with various assumptions on uncertainty structures and prior information and with various atmospheric chemistry-transport models. Each of them can assimilate some or all of the available observation streams for its domain area of interest: flask samples, in-situ measurements or satellite observations. Although referenced in peer-reviewed publications and usually accessible across the research community, most systems are not at the level of transparency, flexibility and accessibility needed to provide the scientific community and policy makers with a comprehensive and robust view of the main structure and functionalities of the system, and to demonstrate how it operates in a simple academic case.

system that will bring major software advances for future inversion frameworks. The presentation is clear and well written. I have a few major concerns detailed in the main comments below.

Main comments
The paper advertises the multi-CTM (multi-model) capability, but this is not shown in the paper, which only mentions a future paper.
This capability needs to be showcased in the paper or such claims should be removed or rephrased.

The CIF is a multi-purpose system. The two main purposes are to enable the comparison of different inversion methods with a given inversion case, and the comparison of different transport models for a given inversion configuration. We decided to split the two aspects into two different papers for clarity. We clarified the text accordingly p4 l14: In the present paper, we lay out the basis of the CIF, giving details on its underlying principles and overall implementation. The proof-of-concept focuses on the implementation of several inversion methods, illustrated with a test case. We will dedicate a future paper to the evaluation of the system on a real-life problem with a number of interfaced atmospheric (chemistry-)transport models.
It is a bit limiting to narrow the scope only to GHGs, as inversion of reactive species and primary aerosols is quite relevant as well, especially if the CIF plans to include, or already includes, the CHIMERE CTM, which was originally designed for air quality applications.

The CIF is primarily designed by members of the GHG community, hence the bias in presentation. However, all the conceptual environment of the CIF can be applied directly to chemistry, as is the case in CHIMERE, which solves complex chemical schemes reproducing the full chemistry of O3, NO2, VOCs, CO, etc., as well as in LMDZ, e.g., with simple chemical schemes around methane, CO or nitrous oxide. We clarified the text accordingly p3 l34: In particular, although primarily designed for GHG applications, the CIF is based on a general structure that will allow applications to air quality data assimilation.
Air quality applications are also mentioned throughout the introduction more explicitly.
The ensemble and analytical results point out possible problems in the implementation of the EnKF and in the analytical solution using correlation length scales.
There was indeed a difference in the way the correlations were computed in the analytical and the ensemble method, versus the variational method. We fixed the issue and also updated the test cases to better highlight the consistency of the results.
It is not entirely clear on how the system uses the ensemble information to perform the inversion. This ensemble-based inversion should not be called an EnKF as it is not sequential. This needs to be revised or diagnosed or explained.
We agree that in this application there is no time-stepping in the EnKF solution and hence the word Filter is not appropriate. However, the implementation of the solution methods allows it, and algorithmically follows the Ensemble Square Root Filter originally described in Whitaker and Hamill (2002) and later used in Peters et al. (2005). The important distinction from the other methods is the use of the statistics of the ensemble to solve the Kalman equations, and the possibility to solve non-linear problems. We think the name EnSRF is therefore suited, and will use it in the revised version. We also include the note about the time-stepping.
The section on ensemble methods (p8-10) has been extensively rewritten to clarify the difference between the EnSRF and other ensemble methods. See track-change manuscript.

Finally, testing the system using the Gaussian plumes on my local machine (Mac) was impossible in the time I could give. I regularly use python and successfully set up conda on my local machine. But it seems that the CIF depends on many specific packages that bring conflict issues and are not straightforward to install (specifically, GDAL seems problematic). It required some research to find different commands to install the packages from what is indicated in the online documentation.
After trying for hours, I had to give up as my time to revise this paper is limited.
We are sorry to hear about the difficulties in installing our system. We will give more details on the installation in the documentation web pages. To make the testing of our system easier, we have now set up a docker image (publicly available here: https://hub.docker.com/repository/docker/pycif/pycif-ubuntu) with all packages correctly installed and demonstration data available as well. Thus, new users can simply download the docker image and run it on their system, avoiding complicated configuration of libraries.
The docker image is provided with the corresponding configuration command lines to be run on a standard Ubuntu system to obtain the same environment. These lines can be adapted to other OSes and distributions.
More specifically, we agree that the GDAL library can create some issues when not correctly installed. However, GDAL is standard in geoinformation systems and such a library is essential for all geographical operations, such as reprojecting from one grid to another, or finding in which grid cell an observation point is. On very regular grids, such a library is not necessary, but the CIF is designed to accommodate many transport models and input data, which are not necessarily always on regular lon/lat grids.
The docker image is now mentioned in Sect. 3.1 p.17 l19: To foster portability and dissemination, a dedicated Docker image is distributed with pyCIF, providing a stable environment to run the system and enabling full reproducibility of the results from one machine to the other.

Specific comments

-P3, L8: Define better what a unified system would be here.
Rephrased p3 l11 as: A unified system, as a community platform running multiple transport models, with diverse inversion methods, would provide ...
-P3, L12: OOPS main goals are on NWP. The CAMS system will run on OOPS (which is based on IFS), which will run inversions in the future. Maybe it is worth mentioning the CoCO2 project (formerly CHE). The authors should also mention the Joint Effort for Data Integration led by UCAR/JCSDA: https://www.jcsda.org/jcsda-project-jedi.

The CAMS operational system will indeed run with OOPS, to be adapted for flux inversion purposes. The CIF is part of the CoCO2 project and is used as an interface between the operational community of CAMS and the research community. The CIF will also be used for the design of operational systems at the national and regional scales in Europe. We clarified this aspect and also mention the JEDI effort as follows p3 l15: "Collaborative efforts towards unified systems are already happening in other data assimilation communities, with, e.g., the Object-Oriented Prediction System (OOPS; coordinated by the European Centre for Medium-Range Weather Forecasts, UK), or the Joint Effort for Data Integration (led by UCAR/JCSDA; https://www.jcsda.org/jcsda-project-jedi). The Data Assimilation Research Testbed (DART) is also an example of a collective effort proposing common data assimilation scripts for diverse applications (e.g., Earth system, chemical species inversions)."

-P3, L13-14: DART is an EnKF (EAKF to be more precise) that allows running the CESM earth system model. There is also a WRF mode. I would not call this a unified or modular system compared to OOPS, CIF or JEDI. Also, DART can run atmospheric composition and recently chemical species inversions as well (see Gaubert et al., 2020).

We thank the reviewer for these insights and have updated the text accordingly (see reply above).
-P5, L10: Some of the models mentioned here use a semi-Lagrangian advection scheme. Maybe remove Eulerian and use a different word.
There was a mistake in our references indeed. We have now replaced Mizzi by Kuhlman et al. 2019.

-Equation 2: Does N indicate a normal distribution? Please clarify. Also, in the second brace element, is this zero in the first moment? If yes, I guess this assumes unbiased observations and prior. I know this is a requirement for data assimilation formulations, but it is often not the case in practice in atmospheric composition. Maybe you can detail and develop this in the text.
N indeed indicates a normal distribution. This is now stated in the text. We also clarified the underlying assumption of unbiased errors. This now reads as follows: "... is to assume prior and observation spaces to be normal distributions, noted N(·, ·) below, the first argument representing the average of the distribution and the second argument the covariance matrix. In addition, when assuming that the distributions in the state vector space and the observation space are independent from each other, and that errors in the observation and the state vector spaces have Gaussian, unbiased distributions, it is possible to mathematically derive the Bayes theorem and to represent the distributions of Eq. (2) as follows: ..." The assumption that errors are unbiased, which makes it possible to write normal distributions in Eq. (1) with means x_b, 0 and x_a respectively, is needed to simplify the formulation of the Bayesian problem in atmospheric inversions.
Neglecting error biases has known impacts on inversion results (Masarie et al., 2011); they can be accounted for online as an unknown to be solved by the inversion (Zammit-Mangion et al., 2021), but are often treated offline from the inversion, either through pre-processing of inputs or post-processing of outputs.
-P6, L22-25: This is an important point, maybe the authors could expand more on this for the non-data assimilation expert?
We give some more details, especially in terms of linearisation around a given point to approximate the observation operator by its Jacobian. p7 l11: Analytical inversions can also be used on slightly non-linear problems, by linearizing the observation operator around a given reference point using the tangent linear of the observation operator. It formulates as follows: with δx a small deviation from x_b within a domain where the linear assumption is valid, dH_{x_b} the tangent-linear of H at x_b and H_{x_b} the Jacobian matrix of H at x_b. Then Eq. (3) can be easily adapted by replacing (y_o − H x_b) by (y_o − H(x_b)) and H by H_{x_b}.
The computation of an analytical inversion faces two main computational limitations. First, the matrix H representing the observation operator H must be built explicitly. This can be done either column by column, in the so-called response function method, or row by row, in the so-called footprint method. The two approaches require respectively dim(X), the dimension of the target space, and dim(Y), the dimension of the observation space, independent simulations. In the response function method, each column is built by computing {dH_{x_b}(δx_i) ∀ δx_i ∈ B_X} with B_X the canonical basis of the target space. For a given increment δx_i, the corresponding column gives the sensitivity of observations to changes in the corresponding component of the target space. In the footprint method, each row is built by computing {H*_{x_b}(δy_i) ∀ δy_i ∈ B_Y} with B_Y the canonical basis of the observation space. For a given perturbation δy_i of a component of the observation vector, the corresponding row of H gives the sensitivity of the inputs to that perturbation.

space as it is more generally used in atmospheric DA) or explain the differences.
We think that the use of "control vector" can be misleading in some contexts and prefer "target vector", which is also widely used in the community. We have checked for the consistency of the use of target vs. control vector over the whole manuscript.
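Coming back to the column-by-column (response function) construction of H described above, the procedure and the subsequent analytical Bayesian update can be sketched in a few lines of numpy. This is a minimal toy example under our own naming, not actual pyCIF code; the finite-difference perturbation stands in for a transport-model simulation:

```python
import numpy as np

def build_H_response_functions(obs_operator, x_b, n_target, eps=1.0):
    """Build the Jacobian H column by column (response-function method):
    each column is the sensitivity of all observations to a perturbation
    of one component of the target vector around x_b."""
    n_obs = obs_operator(x_b).size
    H = np.empty((n_obs, n_target))
    for i in range(n_target):
        dx = np.zeros(n_target)
        dx[i] = eps
        H[:, i] = (obs_operator(x_b + dx) - obs_operator(x_b)) / eps
    return H

def analytical_inversion(x_b, B, R, y_o, obs_operator):
    """Analytical Bayesian update:
    x_a = x_b + B H^T (H B H^T + R)^-1 (y_o - H(x_b))."""
    H = build_H_response_functions(obs_operator, x_b, x_b.size)
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # Kalman gain
    x_a = x_b + K @ (y_o - obs_operator(x_b))
    A = (np.eye(x_b.size) - K @ H) @ B             # posterior covariance
    return x_a, A

# Toy linear problem: 2 unknown fluxes observed through a known operator
H_true = np.array([[1.0, 0.5], [0.2, 1.0], [0.7, 0.7]])
x_true = np.array([2.0, -1.0])
x_b = np.zeros(2)
B = np.eye(2)
R = 0.01 * np.eye(3)        # accurate observations
y_o = H_true @ x_true
x_a, A = analytical_inversion(x_b, B, R, y_o, lambda x: H_true @ x)
```

For a linear operator the finite-difference columns recover H exactly; each column costs one forward simulation, which is why the method only scales to small target spaces, as noted in the revised text.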

-P7, L12: This part of the sentence is not very clear. Replace "... problems such as the variational and the ensemble Kalman filter methods which are described below."

We have replaced it by: For this reason, smart adaptations of the inversion framework (including approximations and numerical solvers) are often necessary to tackle problems even when they are linear; in the following, we choose to elaborate on some of the most frequent approaches used in the atmospheric inversion community: the variational approach and one ensemble method, the Ensemble Square Root Filter.
-P7, L18: "limited non-linearity": This is not necessarily true, or it is misleading. EnKFs are frequently applied successfully to chemical transport models, which can be significantly non-linear systems, in the polluted boundary layer for example.

Indeed, the ensemble methods can in principle handle non-linear systems, although we found in a CO2 application of the EnSRF that non-linearity in the observation operator quickly degrades the analysis (Tolk et al., 2011). We have revised the text.
In the current version, only the EnSRF approach is implemented in the CIF. One should note that the EnSRF, as a direct approximation of the analytical solution, can be very sensitive to non-linearity in the observation operator (e.g., Tolk et al., 2011). It can generally cope only with slight non-linearity over the assimilation window; thus, the assimilation window length has to be chosen appropriately, contrary to other ensemble methods which are usually not sensitive to non-linearity.
-P7, L25: "running computation window": I believe the authors refer to what is commonly called assimilation window?
(Note that inversion is part of data assimilation techniques, so it is fine to call it data assimilation window in the inversion framework). Please be consistent in the terminology in this paragraph. "Smaller", smaller than what? I understand that for GHG inversions, especially CO2, you need a long window given the observation network and the CO2 atmospheric lifetime, but this is not general to all atmospheric composition inversions.
We have clarified the terminology. For a given period of interest, the EnSRF computes smaller running assimilation windows. The length of the assimilation windows can vary depending on the scale, the species of interest, the process of interest, the observations used, etc. The whole ensemble methods section has been rewritten.

needs to be proven. If the reference exists please add it. One can foresee that with a better coverage and higher satellite sensitivity towards the surface, the window length could be reduced, hence the number of observations to go through sequentially is reduced. Please correct or remove this statement or justify this more clearly.
We agree with the reviewer that this statement was too vague. The assimilation window must be shortened as much as possible to limit the number of observations, but too short windows generate their own problems of independence of windows and propagation of information between windows. We have clarified to: for very dense observations, such as datasets from new-generation high-resolution satellites, the sequential assimilation of observations may not be sufficient, or at least methods may be needed to make the observation errors between sequential assimilation windows independent, for example by applying a whitening transformation to the observations to form a new set with uncorrelated errors as suggested by Tippett et al., 2003. The challenge is exacerbated for long-lived species such as CO2, for which assimilation windows must be long enough to maintain the propagation of information on the fluxes over long distances through transport. Propagating a covariance matrix from one assimilation window to the next as accurately as possible could in principle limit the latter issue, as suggested in Kang et al., 2011, but could still prove hard to apply in very high resolution problems.
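The whitening transformation mentioned above can be sketched as follows (a schematic numpy example with our own function names, not the implementation in the CIF): with R = L Lᵀ its Cholesky factorization, the transformed observations y' = L⁻¹y and operator H' = L⁻¹H have uncorrelated, unit-variance errors and can thus be assimilated one at a time.

```python
import numpy as np

def whiten_observations(y, H, R):
    """Whitening transformation: with R = L L^T (Cholesky), the transformed
    observations y' = L^-1 y have uncorrelated, unit-variance errors, so
    they can be assimilated serially."""
    L = np.linalg.cholesky(R)
    y_w = np.linalg.solve(L, y)      # L^-1 y
    H_w = np.linalg.solve(L, H)      # transformed observation operator
    return y_w, H_w                  # errors of y_w have covariance I

# Correlated observation errors between two nearby retrievals
R = np.array([[1.0, 0.8],
              [0.8, 1.0]])
H = np.array([[1.0, 0.0],
              [1.0, 0.2]])
y = np.array([1.0, 1.1])
y_w, H_w = whiten_observations(y, H, R)
```

The transformed error covariance L⁻¹ R L⁻ᵀ is the identity by construction, which is what enables a sequential treatment of otherwise correlated observations.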

-P8, L11: "under-estimating": This is not necessarily true. Small ensemble sizes can over- and underestimate the uncertainty. More importantly, they introduce spuriousness. Please clarify the statement.
We have clarified the statement. Small ensembles can cause the posterior ensemble to collapse, i.e., the posterior distribution is dominated by one or a very small number of members, which does not allow for a reliable assessment of the posterior uncertainties (Morzfeld et al., 2017); moreover, small ensembles introduce spuriousness in the posterior uncertainties, with unrealistic correlations being artificially generated.
p10: By using random sampling, ensemble methods are able to approximate large dimensional matrices at a reduced cost without using the adjoint of the observation operator (see variational inversion below), which can be challenging to implement. Small ensembles generally cause the posterior ensemble to collapse, i.e., the posterior distribution is dominated by one or a very small number of members, which does not allow for a reliable assessment of the posterior uncertainties (Morzfeld et al., 2017); moreover, small ensembles introduce spuriousness in the posterior uncertainties, with unrealistic correlations being artificially generated. In the EnSRF, small ensembles rather cause a misrepresentation of uncertainty structures, which limits the accuracy of the computed solution, and may require fixes as described in, e.g., Bocquet (2011). In any case, the level of approximation necessary for this approach to work is strongly different for each problem, which requires preliminary studies before consistent application. In particular, the so-called localization of the ensemble can be used to improve the consistency of the inversion outputs (e.g., Zupanski et al., 2007; Babenhauserheide et al., 2015).
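The spurious correlations generated by small ensembles, and the Schur-product localization used to damp them, can be illustrated with a toy numpy experiment (our own sketch, independent of the CIF implementation):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50                        # state dimension (e.g., grid cells)
# Truth: variables are fully uncorrelated, B_true = identity

def sample_cov(n_members):
    """Sample covariance estimated from an ensemble drawn from N(0, I)."""
    ens = rng.standard_normal((n_members, n))
    return np.cov(ens, rowvar=False)

B_small = sample_cov(10)      # small ensemble
B_large = sample_cov(10000)   # large ensemble

def max_spurious(B):
    """Largest off-diagonal sample correlation (should be ~0 for the truth)."""
    C = B / np.sqrt(np.outer(np.diag(B), np.diag(B)))
    return np.abs(C - np.diag(np.diag(C))).max()

# Schur-product localization: taper covariances beyond a cutoff distance
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
taper = np.where(dist <= 2, 1.0 - dist / 3.0, 0.0)   # simple linear taper
B_localized = B_small * taper
```

With 10 members, sizeable correlations appear between variables that are truly uncorrelated, while the large ensemble stays close to the truth; the taper forces covariances beyond the cutoff distance to exactly zero.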
-P8, L13-14: Could you possibly provide a few examples of different problems having different approximations?
We give two examples of past studies that discussed the use of the localization of the Monte Carlo ensemble to help the filter keep consistent results.
-P8 L15-28: I recommend revising this introduction of the variational method. Add a sentence to present the cost function.
Put the equation right after. Explain in a few sentences that the aim is to minimize this cost function and that, to achieve this, the gradient is calculated. Then put the gradient formula.
We have re-arranged the variational section to clarify it. See track-change for detailed modifications. In particular, we give details on how the posterior uncertainties are computed in variational inversions in the new version of the manuscript.
-P8, L25: I am not sure this is true. If the state vector is large, i.e. increasing resolution or augmenting the control variables, or increasing the number of observations, the minimization is expected to be slower.
The limiting factor in the analytical inversion approach is the size of H, which cannot be broken down (unlike B). The variational approach still has limitations, but memory is unlikely to be a problem since B can be broken down; only the computational time could be a limiting factor.
Technically, an extremely large target vector will impede the computation of variational inversions, if only due to memory limitations and convergence time. However, these limits arise for much bigger problems than for analytical and ensemble methods. Also, minimization algorithms such as CONGRAD and M1QN3 are extremely efficient and reach a relatively good level of convergence within only a few tens of iterations, even in very large dimensional problems (when correctly set up). For very large inversion problems (especially in the case of non-linearity), variational inversions still need a thorough set-up to guarantee convergence.
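To make this concrete for non-experts, the kind of iterative minimization performed by algorithms such as CONGRAD can be sketched with a plain conjugate-gradient solver applied to the quadratic variational cost function (a toy numpy example under our own naming, not the operational CONGRAD or M1QN3 code):

```python
import numpy as np

def conjugate_gradient(apply_hess, b, x0, n_iter=50, tol=1e-10):
    """Minimize the quadratic cost J(x) = 0.5 x^T A x - b^T x by conjugate
    gradient, using only Hessian-vector products (the spirit of CONGRAD:
    no explicit Hessian matrix is ever formed)."""
    x = x0.copy()
    r = b - apply_hess(x)
    p = r.copy()
    rs = r @ r
    for _ in range(n_iter):
        Ap = apply_hess(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Variational problem:
#   J(x) = 0.5 (x - x_b)^T B^-1 (x - x_b) + 0.5 (y - Hx)^T R^-1 (y - Hx)
# whose gradient vanishes at (B^-1 + H^T R^-1 H) x = B^-1 x_b + H^T R^-1 y
rng = np.random.default_rng(1)
n, m = 20, 30                       # 20 unknown fluxes, 30 observations
H = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
x_b = np.zeros(n)
B_inv = np.eye(n)
R_inv = 100.0 * np.eye(m)           # accurate observations
y_o = H @ x_true

hess = lambda v: B_inv @ v + H.T @ (R_inv @ (H @ v))
rhs = B_inv @ x_b + H.T @ (R_inv @ y_o)
x_a = conjugate_gradient(hess, rhs, x_b)
```

Only Hessian-vector products are needed, never the explicit Hessian: in a real inversion each product amounts to one forward and one adjoint simulation, and the solver reaches convergence within a few tens of iterations, mirroring the behavior described above.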
-P12, Equation 12: Please be explicit about what the index represents. "... of elementary interchangeable N transformations ..."

We now give more details about this equation. This generalization of the decomposition of the observation operators allows us to simplify the configuration and development of inversions. In particular, the development of new functions is well identified, which makes the development of the adjoint simpler. p15 l15: In such a formalism, all intermediate transformations have the same conceptual level in the code. They are functions ranging from spatial reprojection, to temporal interpolations, to more complex operations such as the reconstruction of satellite total columns from concentrations simulated at individual levels in the transport model. In the CIF, all these transformations have the same input and output structure and, thus, their order can be changed seamlessly to execute a given configuration. Please note that the commutative property of elementary transformations as pieces of code does not guarantee the commutative property of the corresponding physical operators.
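The chaining of elementary transformations, and the fact that the adjoint of the chain is simply the composition of the individual adjoints in reverse order, can be sketched as follows (a schematic example with invented class names, not the pyCIF code):

```python
import numpy as np

class LinearTransform:
    """One elementary transformation of the observation-operator chain
    (e.g., a reprojection or a temporal interpolation), with its adjoint."""
    def __init__(self, M):
        self.M = np.asarray(M)
    def forward(self, v):
        return self.M @ v
    def adjoint(self, w):
        return self.M.T @ w

class ChainedOperator:
    """Observation operator H = T_N o ... o T_1; the adjoint applies the
    individual adjoints in reverse order."""
    def __init__(self, transforms):
        self.transforms = list(transforms)
    def forward(self, x):
        for t in self.transforms:
            x = t.forward(x)
        return x
    def adjoint(self, y):
        for t in reversed(self.transforms):
            y = t.adjoint(y)
        return y

# Two elementary steps: coarse-to-fine interpolation, then point sampling
interp = LinearTransform([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
sample = LinearTransform([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
H = ChainedOperator([interp, sample])
```

The adjoint test <H x, y> = <x, H* y> holds for any chain built this way, which is what makes the development of the adjoint "well identified" once each elementary transformation provides its own adjoint.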
-P13, L10: It is a bit of a pity that air quality applications are only mentioned here. I think a system such as the CIF could greatly benefit the air quality community as much as the GHG community, and the two are getting more connected lately.
We agree. As mentioned above, the CIF is being developed by members of the GHG community primarily.

However, air quality applications are also included in the purposes of the CIF, as, e.g., the chemistry-transport model CHIMERE is implemented in the CIF with complex chemical schemes. AQ applications are now mentioned from the introduction onwards.
-P13, L13: I am not sure what the authors mean by spatial gradients as observations? Please clarify or remove.
Some inversion studies assimilate differences between concentrations at different locations in the observation vector (Bréon et al., 2015), instead of individual observations. This approach is meant to limit the influence of background concentrations in the inversion of local fluxes. p15 l7: authors implemented differences between observation sites and times in the observation vector instead of observations from individual sites, in order to focus on spatial/temporal gradients, thus allowing to limit the influence of background concentrations in the computation of local fluxes.

-P13, L14-15: Maybe the authors could mention here that the main idea behind "super-obing" is to lower the discrepancies of representativeness between model and observations.
We thank the reviewer for helping to clarify our text. We have added statements about the representativeness of the observations p16 l4: In the case of super-observations (satellite retrievals, images, spatial gradients, etc.) in the observation vector, it is often necessary to combine multiple simulated point observations in given grid cells and time stamps into a single super-observation, to limit redundant observations, hence the size of the observation vector.
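The aggregation of point observations into a super-observation can be sketched as an inverse-variance weighted mean (a minimal example of our own, assuming independent errors within the grid cell; the CIF handles more general cases, e.g., satellite columns with averaging kernels):

```python
import numpy as np

def aggregate_super_obs(values, errors):
    """Combine point observations falling in the same grid cell and time
    stamp into one super-observation: inverse-variance weighted mean,
    with the corresponding aggregated error (independent errors assumed)."""
    w = 1.0 / np.asarray(errors) ** 2
    mean = np.sum(w * np.asarray(values)) / np.sum(w)
    err = np.sqrt(1.0 / np.sum(w))
    return mean, err
```

Aggregating this way reduces the size of the observation vector while keeping the aggregated error smaller than that of any individual observation, consistent with the representativeness argument above.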
-P13, L15-16: "This is the case for...time and locations," This sentence is very unclear. Please rephrase or explain better.
We agree with the reviewer. We have split and clarified our statement. p16 l8: Super-observations are commonly used in the case of satellite observations being compared to all the model levels above a given location; concentration gradients comparing observations at different times and locations (see e.g., Bréon et al., 2015; Staufer et al., 2016) are another example of observation aggregation to reduce representativeness errors. Isotopic ratios are also super-observations as they require simulating separate isotopologues and recombining them after the simulation (as done in, e.g., van der Velde et al., 2018; Peters et al., 2018).

-P15, L21-31: What about satellite observations? I think the authors should add a statement about this.

At the moment (in a version of the CIF more advanced than the one described in the present paper), satellite observations are included in the observation operator, but the user must format original files manually to a standard observation format readable by the CIF, with columns, averaging kernels, etc. in a specific form. In the future, there will be a need to allow raw satellite formats to be ingested automatically in the CIF without manual formatting. p19 l1: satellite products, in particular the formatting of averaging kernels and other metadata, should also be included in the CIF in the near future as they play a growing role in the community.

-P17, L11: Those studies use the Gaussian plume technique but do not provide a continuous inversion constraint globally.
Please correct the statement as it makes the reader believe that the Gaussian plume technique would allow a global inversion similar to a CTM-based inversion.
We fully agree with the reviewer and have clarified the sentence.

We have used centered color scales for the uncertainty matrix as plotted now. But as fluxes are not centered, we find that our color scales are the best compromise for readability.
-P20, L20: I am unsure if I understand correctly, but I believe the authors are speaking about iterations in the variational sense and members in the EnKF sense. If yes, please add a few sentences to explain what the word 'simulation' means.
Alternatively, I would replace the word 'simulations' by 'iterations or members.'

The reviewer understood correctly. We clarified. We use this measure on the x-axis to be comparable between variational and ensemble methods, as one iteration of a variational algorithm requires two simulations (one forward and one adjoint). Details are given about the number of simulations in Sect. 4.2.2 p24 l30: More precisely, users and developers of adjoint transport models choose the number of forward re-computations to be carried out, based on a space-speed trade-off: by saving all forward intermediate states, the adjoint is as costly as the forward, but the disk space burden can be extremely challenging to manage, thus increasing the overall computation time in return.
provide a very uniform structure in the increments. A structure similar to the 500 m increments is seen in the result using the 10000 m length scale. I expect there is a typo in the text: should a 0 be removed, i.e., 1000 m?
This was indeed 10000 m, but the high number of observation sites counter-balanced the long correlation lengths. We have changed the set-up with fewer observation sites so as to obtain very uniform increments when using very large correlation scales. This is a simple test to demonstrate the consistency of the definition of the correlations in the system. To make this test clearer, we now prescribe a correlation length of 200 km, hence 100 times the size of the domain.
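The effect of prescribing a correlation length much larger than the domain can be seen directly when building the prior-error covariance matrix with exponentially decaying correlations (a numpy sketch under our own naming; the exact correlation shapes used in the CIF may differ):

```python
import numpy as np

def build_B(coords_km, sigma, corr_length_km):
    """Prior-error covariance with exponentially decaying spatial
    correlations: B_ij = sigma_i * sigma_j * exp(-d_ij / L)."""
    d = np.abs(np.subtract.outer(coords_km, coords_km))  # 1-D distances
    C = np.exp(-d / corr_length_km)
    return np.outer(sigma, sigma) * C

coords = np.linspace(0.0, 2.0, 5)        # 2 km domain, 5 grid cells
sigma = np.ones(5)
B_long = build_B(coords, sigma, 200.0)   # L = 100x the domain size
B_short = build_B(coords, sigma, 0.05)   # L much shorter than a cell
```

With L = 200 km on a 2 km domain, all entries of B are close to 1, so the inversion spreads increments almost uniformly; with L much shorter than a grid cell, B is effectively diagonal.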
-P20, L33-P21, L2: This looks to me more like a bug rather than noise, as vertical lines or "chessboard"-like patterns appear in the increment results. Such patterns also occur in the analytical solution. This does not look like sampling noise to me. I think this points out possible problems in the implementation of the EnKF and of the systems other than the variational methods. It is not entirely clear if the system uses the ensemble information to derive Jacobians for the calculation of H and perform a minimization? In this case this inversion cannot be called an EnKF. This needs to be revised or diagnosed or explained.
Thanks for pointing this out. There was a small difference in the definition of the B matrix between analytical and ensemble methods on the one side, and variational inversions on the other side. We have removed it and have correspondingly re-computed the inversions.

Technical comments
-P2, L27-29: Put an order in all those references, chronological or alphabetical.
with V_{x_a} the dominant eigenvectors of the Hessian matrix at the point x_a and Λ_{x_a} the matrix of the dominant eigenvalues of the Hessian matrix. Please note that the dominant eigenvalues of the Hessian matrix correspond to components with low posterior uncertainties in A.

Another approach to quantify the posterior uncertainty matrix A, valid for both linear and non-linear cases, is to carry out a Monte Carlo ensemble of independent inversions with prior vectors sampled from the prior distribution N(x_b, B). An ensemble of posterior vectors is inferred and used to compute the posterior matrix.

-P15, L18: I understand what the authors mean but this needs to be reformulated. "Meteos" is not really correct English.
This referred to the keyword used in the CIF, which is abbreviated for the sake of readability; we replaced the word in the body of the manuscript by "meteo-data".
-P19, L3-10: Maybe add the equations in the text where they are mentioned.

We re-arranged this section accordingly.
-P22, L6: This is not a fair statement considering how the "EnKF" has been implemented here. It does not seem to be strictly an EnKF as the sequential assimilation is not performed. Consider changing or removing this in the conclusion. Also, it is not in the scope of a GMD paper to compare methods, but to describe and showcase model/software developments for geoscience.

We agree that a GMD "description paper" should not go too far in interpreting demonstration results. We have reformulated and reduced the comparison of methods. See track-change for modifications in the result section.
-P22, L10-12: The authors could add some sentences about the challenges in implementing actual CTMs in the CIF framework. For example, among many other challenges, I/O is a limiting factor for data assimilation, especially with CTMs.

We agree that I/O can be a limiting factor. With its very general formalism, the CIF may not be able to accommodate very specific tweaks necessary for some models and some very large applications. However, using very efficient numerical libraries in python (which use C in the background), one should be able to keep pace with systems based on compiled code (mainly Fortran in the community). We have added a paragraph in the conclusion to discuss this issue: The next step of the CIF is the implementation of a large variety of CTMs. The implementation of new CTMs already interfaced with other inversion systems should not bring particular conceptual challenges, as all interface operations are already written in their original inversion system; in most cases, re-arranging existing routines is sufficient to interface a model with the CIF. One particular challenge concerns I/O optimizations: the generation of inputs and the processing of outputs can be time consuming and, in some very heavy applications, require specific numerical and coding optimizations. The very general formalism of the CIF may hamper the ability to apply these particular optimizations for some models. Best efforts will have to be deployed to take full advantage of advanced I/O and data manipulation libraries in python to limit this weakness.

General Comments
This paper introduces a software system to facilitate ensembles of inverse calculations. The likely focus is atmospheric inversions in which model inputs such as surface fluxes are inferred from combinations of prior information and observations. Such a system is clearly useful since ensembles based on choices of ingredients (such as models or input covariances) provide a better basis for both the means and uncertainties of posterior estimates. The paper is certainly within scope for GMD since it describes a system which might well be widely used in the future. It is also well written, though some final proofreading, especially around pluralisation, will be needed.
I note I am not reviewing the software itself. There is only one major critique of the paper. I think the benefits of the system are slightly over-sold (but what software release doesn't do this). I believe, for example, that CIF doesn't provide a framework for comparing estimates where the target spaces differ, such as EnKF with short windows vs long-term mean approaches. That's an intellectual/mathematical task which one could instantiate in software. This would probably involve classes representing PDFs, which might be internally represented by ensembles or parameters, and functions then defining transformations among these PDFs. That quibble aside, there is no doubt the system will help such comparisons.

We agree that the CIF does not provide a plug'n'play solution to the problem of comparing totally different inversion set-ups in a statistically consistent way. Doing so would require further mathematical considerations that may prove necessary in the future to compare methods/models that cannot be computed on, e.g., exactly the same inversion window (i.e., when comparing inversion results for a given region, one computed with a regional model and the other with a global one). However, having a consistent framework with comparable configuration parameters and identical output format will definitely help to develop such a comparison.
On the positive side the ability to configure the computational pipeline is a marvellous innovation. Building our own, less ambitious system, we have struggled to make it agnostic about computation.
Perhaps more important than supporting intercomparisons, the tools provided as part of the system will lower the entry barrier for new groups entering the field. This was a surprising spin-off from the TRANSCOM effort 20 years ago. Most systems since then have been too closely tied to a given model for universal use, so CIF can provide a major boost to the field.

The fact that it is well documented will only help this.
There is one aspect, though, that seems seriously deficient. If I read correctly, methods for calculating uncertainty in the Gaussian posterior are not provided. One important role of a new framework like CIF is to educate, to encourage good practice.
That includes calculating and engaging with the uncertainty in the results. Certainly any of the analytic methods under-estimate this term, but it is still true that under-sampling of concentration is a major source of uncertainty for inversions at all scales. I request that the authors at least detail a road map for including posterior uncertainty.
We agree: this was missing. Of course, the posterior uncertainty calculation depends on the approach used (analytical, CONGRAD, M1QN3, or EnKF), as well as on how the error matrices are computed in the system. We have implemented the computation of the posterior error matrices for all methods and have re-arranged the manuscript to demonstrate the capability to provide uncertainty reduction and posterior error matrices. In particular, posterior matrices are shown in the Results section.

Specific Comments
-P7 bottom: this correlation of errors is a common critique of ensemble methods, but I've rarely seen it implemented in a non-ensemble method either, and it's perfectly possible to do in ensemble methods: one just adds some extra state variables carrying observational corrections and builds the correct autocorrelations into them instead.

We agree: most variational inversions assume diagonal observation error matrices, or block diagonals, which is somehow equivalent to the running assimilation window of the Kalman Filter.
-P8, L22: saying "the maximum probability" and "the mode" is tautological I think.

We have re-arranged the variational section following comments from reviewer #1 and have removed this tautology.

-P9: congratulations for explicitly writing out the regularisation/reduction step which is so often assumed. It is called pre-conditioning so often that you should add the term for clarity.

We now mention the term pre-conditioning for consistency with the terminology used in the community.

-P9: it's worth commenting that the leading eigenvectors of the Hessian correspond to the low-uncertainty parts of the solution, which might or might not be what you want.

Thank you for pointing out this aspect. We now give details on the formula used to deduce the posterior matrix from the eigenvectors of the Hessian and mention this.
p12 l8: The approximation of the posterior uncertainty matrix A in the case of CONGRAD reads: A ≈ B^{1/2} (I + V_{x_a} (Λ_{x_a}^{-1} − I) V_{x_a}^T) B^{T/2}, with V_{x_a} the dominant eigenvectors of the Hessian matrix at the point x_a and Λ_{x_a} the matrix of the dominant eigenvalues of the Hessian matrix. Please note that the dominant eigenvalues of the Hessian matrix correspond to components with low posterior uncertainties in A.
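The leading-eigenpair approximation can be checked numerically on a toy linear problem. The sketch below is purely illustrative (not pyCIF code; random matrices stand in for B, R and the Jacobian of the observation operator): it builds the preconditioned Hessian I + B^{1/2,T} H^T R^{-1} H B^{1/2}, keeps its k dominant eigenpairs, and reconstructs A; with all eigenpairs retained the reconstruction is exact.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 5                        # target and observation space sizes

# Toy linear inversion: random Jacobian, diagonal prior and obs errors
H = rng.standard_normal((p, n))
B = 0.5 * np.eye(n)
R = 0.1 * np.eye(p)
B_half = np.linalg.cholesky(B)     # B = B_half @ B_half.T

# Exact posterior covariance: A = (B^-1 + H^T R^-1 H)^-1
A_exact = np.linalg.inv(np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H)

# Preconditioned Hessian: I + B^{1/2,T} H^T R^-1 H B^{1/2}
Hess = np.eye(n) + B_half.T @ H.T @ np.linalg.inv(R) @ H @ B_half
lam, V = np.linalg.eigh(Hess)      # eigenvalues in ascending order

def A_approx(k):
    """Keep the k dominant eigenpairs; remaining eigenvalues treated as 1."""
    Vk, lk = V[:, -k:], lam[-k:]
    inner = np.eye(n) + Vk @ np.diag(1.0 / lk - 1.0) @ Vk.T
    return B_half @ inner @ B_half.T

# With all eigenpairs, the approximation recovers the exact posterior matrix
assert np.allclose(A_approx(n), A_exact)
# The approximation error decreases as more dominant eigenpairs are kept
errs = [np.linalg.norm(A_approx(k) - A_exact) for k in range(1, n + 1)]
assert all(e2 <= e1 + 1e-12 for e1, e2 in zip(errs, errs[1:]))
```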

Another approach to quantify the posterior uncertainty matrix A, valid for both linear and non-linear cases, is to carry out a Monte Carlo ensemble of independent inversions with prior vectors sampled from the prior distribution N(x_b, B) (e.g., ?). An ensemble of posterior vectors x_a^i is inferred and used to compute the posterior matrix as the sample covariance of the ensemble: A ≈ 1/(N−1) Σ_i (x_a^i − x̄_a)(x_a^i − x̄_a)^T.

-P11: citing the Bousserez and Henze paper is a good advertisement for your approach; you might want to say in the abstract or intro that many new methods are hybrid and this combined framework simplifies implementing such methods.

We insist on the advantage for modern hybrid methods of such a flexible structure. The decomposition into elementary operations makes it easier to recompose them in various ways. Besides, the CIF allows new methods to be tested with multiple models, hence inquiring into their strengths and weaknesses in various contexts.
p25 l14: With modern inversion methods moving towards a hybrid paradigm of variational and ensemble methods, the flexibility of the CIF will be a valuable asset, as abstract methods can be easily mixed with each other.
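Returning to the Monte Carlo estimate of A above: for a linear-Gaussian toy problem it can be sketched as follows. Note that recovering the full posterior covariance in the linear case also requires perturbing the observations with noise drawn from R, in addition to sampling the prior; the sketch below (illustrative only, not pyCIF code) perturbs both.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, n_members = 3, 4, 20000

H = rng.standard_normal((p, n))    # stand-in for a linear observation operator
B = np.eye(n)
R = 0.5 * np.eye(p)
x_b = np.zeros(n)
y = rng.standard_normal(p)

# Kalman gain and exact posterior covariance for reference
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
A_exact = (np.eye(n) - K @ H) @ B

# Monte Carlo: sample priors from N(x_b, B), perturb observations with R,
# and run one (here linear, explicit) inversion per member
x_bs = rng.multivariate_normal(x_b, B, size=n_members)
y_pert = y + rng.multivariate_normal(np.zeros(p), R, size=n_members)
x_post = x_bs + (y_pert - x_bs @ H.T) @ K.T

# Sample covariance of the posterior ensemble approximates A
A_mc = np.cov(x_post, rowvar=False)
assert np.allclose(A_mc, A_exact, atol=0.05)
```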
-P15: a technical question: how do you handle target variables which aren't needed at various stages, e.g. bias corrections in observations which aren't inputs to the CTM? The pipelining structure doesn't lend itself well to this problem.

The mathematical formalism of Eq. (15) and (16) suggests that transformations are necessarily computed in a serialized way, thus limiting applications to simple target variables upstream of the transport model. However, each elementary transformation handles only the components of the inputs it is concerned with, leaving the rest identical and forwarding it to later transformations. Thus, it does not actually limit applications to simple target variables upstream of the CTM.
For instance, in the case of target variables optimizing biases in the observations, the corresponding components of the target vector x are forwarded unchanged by all transformations in Fig. 2 until the very last operation, where they are used for the final comparison to the observation vector.
-P18, Eq. 17 I think: do you consider sources as points or area integrals? If the latter, do you need to integrate over the area of the source, especially for nearby observations?
The formulas used in Eq. 16 and 17 are based on point sources. The word "gridded" in the text is therefore misleading. We have clarified this statement by: p21 l18: we use a grid of point surface fluxes to simulate concentrations using the Gaussian plume equation.
-P20 it's worth adding that the amount of intermediate calculation needed is at user discretion; a space-speed trade-off.
CMAQ, for example, needs almost none, having checkpointed what it needs on the forward run it had to do anyway at the start of the iteration.

We thank the reviewer for highlighting this aspect. We have extended the paragraph as follows to mention it: p23 l29: More precisely, users and developers of adjoint transport models choose the number of forward re-computations to be carried out, based on a space-speed trade-off: by saving all forward intermediate states, the adjoint is as costly as the forward run, but the disk-space burden can be impossible to manage, thus increasing the overall computation time in return.
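The space-speed trade-off can be illustrated on a toy scalar model (purely illustrative, not tied to CMAQ or any actual CTM): the adjoint sweep needs the forward states, which can either all be stored or be recomputed from sparse checkpoints; both strategies yield the same gradient.

```python
import math

def f(x):       # one model step (toy non-linear dynamics)
    return math.sin(x) + 0.5 * x

def fprime(x):  # tangent linear of one step
    return math.cos(x) + 0.5

T, x0 = 64, 0.3

# Strategy 1: store every forward state (fast adjoint, large storage)
states = [x0]
for _ in range(T):
    states.append(f(states[-1]))
adj = states[-1]                 # dJ/dx_T for the toy cost J = 0.5 * x_T**2
for t in range(T - 1, -1, -1):
    adj = fprime(states[t]) * adj
grad_full = adj

# Strategy 2: store a checkpoint every k steps, recompute in between
k = 8
ckpts = {}
x = x0
for t in range(T):
    if t % k == 0:
        ckpts[t] = x
    x = f(x)
adj = x                          # x is now x_T
for t in range(T - 1, -1, -1):
    base = (t // k) * k          # nearest earlier checkpoint
    xt = ckpts[base]
    for _ in range(t - base):    # recompute x_t from the checkpoint
        xt = f(xt)
    adj = fprime(xt) * adj
grad_ckpt = adj

# Same gradient, but strategy 2 stores T/k states instead of T
assert abs(grad_full - grad_ckpt) < 1e-12
```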
The Community Inversion Framework v1.0: a unified system for atmospheric inversion studies

Correspondence: antoine.berchet@lsce.ipsl.fr

Abstract. Atmospheric inversion approaches are expected to play a critical role in future observation-based monitoring systems for surface fluxes of greenhouse gases (GHG), pollutants and other trace gases. In the past decade, the research community has developed various inversion software tools, mainly using variational or ensemble Bayesian optimization methods, with various assumptions on uncertainty structures and prior information, and with various atmospheric chemistry-transport models. Each of them can assimilate some or all of the available observation streams for its domain area of interest: flask samples, in-situ measurements or satellite observations. Although referenced in peer-reviewed publications and usually accessible across the research community, most systems are not at the level of transparency, flexibility and accessibility needed to provide the scientific community and policy makers with a comprehensive and robust view of the uncertainties associated with the inverse estimation of GHG and reactive species fluxes. Furthermore, their development, usually carried out by individual research institutes, may in the future not keep pace with the increasing scientific needs and technical possibilities. We present here a Community Inversion Framework (CIF) to help rationalize development efforts and leverage the strengths of individual inversion systems into a comprehensive framework. The CIF is primarily a programming protocol to allow various inversion bricks to be exchanged among researchers.
In practice, the ensemble of bricks makes a flexible, transparent and open-source python-based tool to estimate the fluxes of various GHGs and reactive species both at the global and regional scales. It will allow running different atmospheric transport models, different observation streams and different data assimilation approaches. This adaptability will allow a comprehensive assessment of uncertainty in a fully consistent framework. We present here the main structure and functionalities of the system, and demonstrate how it operates in a simple academic case.

Introduction
The role of greenhouse gases (GHGs) in climate change has motivated an exceptional effort over the last cou-
In the present paper, we lay out the basis of the CIF, giving details on its underlying principles and overall implementation. The proof-of-concept focuses on the implementation of several inversion methods, illustrated with a test case. We will dedicate a future paper to the evaluation of the system on a real-life problem with a number of interfaced atmospheric (chemistry-)transport models. At the time of writing the present article, the following models are interfaced with the CIF: the Global Circulation Models LMDZ (Chevallier et al., 2005) and TM5 (Krol et al., 2005; van der Laan-Luijkx et al., 2017), the regional chemistry-transport Eulerian model CHIMERE (Fortems-Cheiney et al., 2021), and the Lagrangian particle dispersion models FLEXPART (Pisso et al., 2019) and STILT (Trusilova et al., 2010). For the sake of simplicity, we refer to all types of (chemistry-)transport models generically as CTMs in the following. In Section 2, we describe the general theoretical framework for atmospheric inversions and how the CIF includes the theory in a flexible and general way. In Section 3, the practical implementation of the general design rules is explained, with details on the python implementation of the CIF. In Section 4, we demonstrate the capabilities of the CIF in a simple test case, applying various inversion techniques in parallel.
Therefore, we propose here a general and flexible structure for our system that will be independent of limiting assumptions, as described in Sect. 2.3, to allow future extensions to more general theoretical frameworks. In the following, mathematical formulas are written following notations based on Ide et al. (1997) and Rayner et al. (2019). We present the theoretical basis and several inversion methods that are implemented in the CIF as demonstrators.

Computation modes in the CIF
The present version of the CIF includes three main categories of inversion methods: 1) analytical, i.e. the algebraic solution of the unbiased Gaussian Bayesian problem, 2) ensemble methods with the Ensemble Kalman Filter (EnKF), and 3) variational methods relying on the adjoint of the observation operator. The adjoint operates backwards compared to the observation operator, in the sense that it determines the sensitivity to inputs (e.g. fluxes) given an incremental perturbation to outputs (e.g. concentrations) (e.g., Errico, 1997). In addition to the mentioned data assimilation methods, the CIF also includes the possibility to run forward simulations and to test the adjoint and the tangent linear of the observation operator for given inversion configurations. In the following we call all inversion methods and other types of computation in the CIF "computation modes". With these computation modes implemented in a flexible and general manner, it is anticipated that other inversion methods could be easily added to the CIF in the future (see Sect. 2.3).

Variational inversions
Variational inversions are a numerical approximation to the solution of the inversion problem: they involve the gradient of the cost function in Eq. 5 and require running forward and adjoint simulations iteratively (e.g., Meirink et al., 2008; Bergamaschi et al., 2010; Houweling et al., 2014, 2016; Chevallier et al., 2005). Variational inversions use the fact that finding the mode x_a of the posterior Gaussian distribution p_a(x) ∼ N(x_a, A) in Eq. (2) is equivalent to finding the minimum x_a of the cost function J. The variational formulation does not require the calculation of complex matrix products and inversions, contrary to the analytical inversion, and is thus not limited by vector dimensions. Still, the inverses of the uncertainty matrices B and R need to be computed, potentially prohibiting the use of very large and/or complex general matrices; this challenge is often overcome by reducing B and R to manageable combinations of simple matrices (e.g., Kronecker products of simple shape covariance matrices; see Sect. 2.3.1).
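As a reminder of the quantities involved, the Gaussian cost function is J(x) = ½(x−x_b)ᵀB⁻¹(x−x_b) + ½(H(x)−y)ᵀR⁻¹(H(x)−y), with gradient ∇J(x) = B⁻¹(x−x_b) + HᵀR⁻¹(H(x)−y) in the linear case. A minimal sketch (illustrative only, with a random matrix standing in for the observation operator) verifies the gradient against finite differences and checks that it vanishes at the analytical minimum:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 6, 4
H = rng.standard_normal((p, n))   # stand-in for the observation operator
B = np.eye(n)
R = 0.2 * np.eye(p)
x_b = rng.standard_normal(n)
y = rng.standard_normal(p)
Binv, Rinv = np.linalg.inv(B), np.linalg.inv(R)

def J(x):
    dx, dy = x - x_b, H @ x - y
    return 0.5 * dx @ Binv @ dx + 0.5 * dy @ Rinv @ dy

def gradJ(x):
    return Binv @ (x - x_b) + H.T @ Rinv @ (H @ x - y)

# Finite-difference check of the gradient at a random point
x = rng.standard_normal(n)
eps = 1e-6
g_fd = np.array([(J(x + eps * e) - J(x - eps * e)) / (2 * eps)
                 for e in np.eye(n)])
assert np.allclose(g_fd, gradJ(x), rtol=1e-6, atol=1e-6)

# The gradient vanishes at the analytical minimum x_a
x_a = np.linalg.solve(Binv + H.T @ Rinv @ H, Binv @ x_b + H.T @ Rinv @ y)
assert np.allclose(gradJ(x_a), 0.0, atol=1e-9)
```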

Auxiliary computation modes
Forward simulations

Forward simulations simply use the observation operator to compute simulated observation equivalents, i.e. H(x). This mode is used to make quick comparisons between observations and simulations to check for inconsistencies before running a full inversion. It is also used by the analytical inversion mode to build response functions.

Test of the adjoint
The test of the adjoint is a crucial diagnostic for any inversion system making use of the adjoint of the observation operator. Such a test is typically required after making any edits to the code (to the forward observation operator or its adjoint) before running an inversion. Coding an adjoint is prone to errors and even small errors can have significant impacts on the computation of the gradient of the cost function in Eq. (7).
Thus, one needs to make sure that the adjoint rigorously corresponds to the forward. This test consists in checking the definition of the mathematical adjoint of the observation operator. It writes as follows for a given target vector x and incremental target perturbation δx: ⟨dH_x(δx) | dH_x(δx)⟩ = ⟨δx | dH_x*(dH_x(δx))⟩, where dH_x(δx) is the linearization of the observation operator H at the point x for a given increment δx; it is computed with the tangent linear model, which is the numerical adaptation of dH_x.
In practice, the two terms of the equation are rarely exactly equal. Nevertheless, the difference should never exceed a few times the machine epsilon. Besides, Eq. (13) should be verified for any given target vector and increment. In practice, it is not possible to explicitly verify all possible combinations; but as the result of the test is highly sensitive to any error in the code, it is assumed that a few typical couples (x, δx) are sufficient to certify the validity of the adjoint.

Inversions estimating hyper-parameters through maximum-likelihood or hierarchical Bayesian techniques (e.g., Michalak et al., 2005; Berchet et al., 2014; Ganesan et al., 2014) could be integrated in the CIF by adapting the Gaussian cost function and by implementing a corresponding computation pipeline.
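For a linear operator represented explicitly as a matrix (a stand-in for the tangent-linear model; illustrative only), the adjoint test can be sketched as follows; the relative difference between the two inner products should stay within a small multiple of the machine epsilon, the multiple growing slowly with problem size.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 50, 30

H_tl = rng.standard_normal((p, n))   # tangent-linear operator dH_x
H_ad = H_tl.T                        # its adjoint (here simply the transpose)

dx = rng.standard_normal(n)          # incremental target perturbation
Hdx = H_tl @ dx

lhs = Hdx @ Hdx                      # < dH_x(dx) | dH_x(dx) >
rhs = dx @ (H_ad @ Hdx)              # < dx | dH_x*(dH_x(dx)) >

# Rounding errors accumulate with the vector sizes, hence the loose multiple
rel_diff = abs(lhs - rhs) / abs(lhs)
assert rel_diff < 1e4 * np.finfo(float).eps
```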
The complexity of the selected elementary transformations spans a wide range, from one-line straightforward codes to computationally expensive and complex code implementations. In small-dimensional and/or linear problems, the computation of the observation operator using its Jacobian and matrix products may be computationally expensive, but is in principle rather straightforward to implement. For non-linear and/or high-dimensional problems, these transformations require simplifications and numerous intermediate steps.
For instance, applying matrix products to the error covariance matrices R and B and computing their inverses is easy in small dimensions, but can be limiting in high-dimensional problems. For that reason, the error covariance matrices are often reduced to particular decompositions; for instance, the error covariance matrix on the target vector B is often written as a Kronecker product of multiple spatial and/or temporal covariance matrices of lower dimensions, making matrix products manageable (e.g., Chevallier et al., 2005; Meirink et al., 2008; Yadav and Michalak, 2013).
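The benefit of the Kronecker decomposition can be sketched as follows (dimensions and covariance factors are illustrative): products with B = B_t ⊗ B_s, and with its inverse, can be computed on the small factors without ever forming the full matrix, using the identity (B_t ⊗ B_s) vec(X) = vec(B_s X B_tᵀ).

```python
import numpy as np

rng = np.random.default_rng(4)
n_s, n_t = 20, 15                       # spatial and temporal dimensions

def spd(n):
    """Random symmetric positive-definite matrix (toy covariance factor)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

B_s, B_t = spd(n_s), spd(n_t)

# The full B = B_t (x) B_s has (n_s*n_t)^2 entries -- avoid forming it
x = rng.standard_normal(n_s * n_t)
X = x.reshape(n_s, n_t, order="F")      # unstack the target vector

# Product with B using only the small factors: vec(B_s X B_t^T)
Bx_fact = (B_s @ X @ B_t.T).flatten(order="F")

# Reference: explicit Kronecker product (only feasible in small dimensions)
Bx_full = np.kron(B_t, B_s) @ x
assert np.allclose(Bx_fact, Bx_full)

# The inverse factorizes too: B^-1 = B_t^-1 (x) B_s^-1
Binvx = (np.linalg.inv(B_s) @ X @ np.linalg.inv(B_t).T).flatten(order="F")
assert np.allclose(np.kron(B_t, B_s) @ Binvx, x)
```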
In any case, the observation operator (see details in Sect. 2.3.2) appears as the centerpiece of any inversion method.

The pyCIF library

pyCIF, written in Python 3, is the practical embodiment of the CIF project. All theoretical operations described in Sect. 2 are computed by this module. It includes inversion computations, pre- and post-processing of CTM inputs and outputs, as well as target and observation vector reprojections, aggregation, etc., as written in Eq. (15). Python coding standards follow the community standard PEP-8 (python.org/dev/peps/pep-0008/). Test cases (including the ones presented in Sect. 4) are distributed alongside the CIF codes and scripts.
To foster portability and dissemination, a dedicated Docker image is distributed with pyCIF, providing a stable environment to run the system and enabling full reproducibility of the results from one machine to another.

To reflect the theoretical flexibility required in the computation of various inversion methods and observation operators, we made the choice of implementing pyCIF following an abstract structure with a variety of so-called Python plugins, which are dynamically constructed and inter-connected depending on the set-up.

Objects and classes in pyCIF
General classes of objects emerge from the definition of the abstract structure of the inversion framework. These classes are defined by the data and metadata they carry, as well as by the methods they include and their interaction with other classes. The main classes are the following: modes, for which EnSRF and analytical inversions are available (see details in Sect. 2.2); models: interfaces to CTMs, including the generation of input files, execution of the code and post-processing of outputs; included are a Gaussian model described in Sect. 4 for the demonstration of the system, as well as CHIMERE, LMDZ, FLEXPART, TM5 and STILT, all of which will be described in a dedicated future publication; platforms: deal with specific configurations on different clusters; includes a standard platform as well as two supercomputers where the CIF was tested; target vectors: store and apply operations related to the target vector, including spatial and temporal aggregation, disaggregation and regularization of the target vector;

fluxes, fields, meteo-data: fetch, read and write different formats of inputs for CTMs (surface fluxes, 3D fields and meteorological fields respectively); so far includes only inputs specific to the included CTMs, but will ultimately include standard data streams, such as widely used emission inventories or meteorological fields such as those from ECMWF; measurements: only data from the World Data Center for Greenhouse Gases are handled so far (https://gaw.kishou.go.jp/), but classical data providers such as ICOS (icos-cp.eu) or ObsPack (Masarie et al., 2014) will also be implemented in the CIF; satellite products should also be included in the CIF in the near future as they play a growing role in the community.

Details on metadata and operations for each class are given in the Supplement, Tab. S1. Our objective was to design a code that is fully recursive in the sense that modifying some instance of a class does not require updating other classes calling or being called by the modified class. Thus, each class is built so that it only needs internal data, as well as data from the execution level just before and after it, in order to avoid complex dependencies while allowing proper recursive behaviour in building the transformation pipeline.

Automatic construction of the execution pipeline
To translate the principle scheme of Fig. 1, pyCIF is not built in a sequential, rigid manner. Plugins are interconnected dynamically at the initialization step of pyCIF depending on the chosen set-up (see Sect. 3.3 for details on the way to configure the CIF). The main strength of such a programming structure is the independence of all objects in pyCIF. They can be implemented separately in a clean manner.
The developer only needs to specify what other objects are required to run the one being developed, and pyCIF makes the links to the rest. This avoids unexpected impacts elsewhere in the code when modifying or implementing a feature in the system. In the following, we call this top-down relationship in the code a dependency.
For each plugin required in the configuration (primarily the computation mode), pyCIF initializes corresponding objects following simple rules. Following the dependencies detailed in Tab. S1, for every object to initialize, pyCIF will fetch and initialize the required plugins and attach them to the original plugin. If a required plugin is explicitly defined in the configuration, pyCIF will fetch this one. In some cases, plugins can be built on default dependencies, which do not need to be defined explicitly in the configuration file; the required plugin is then retrieved using default plugin dependencies specified in the code itself.
For instance, in the call graph in Fig. 1, "variational inversion" is a "computation mode" object in pyCIF. To execute, it requires a "minimizer" object (CONGRAD, M1QN3, etc.) that is initialized and attached to it. The minimizer requires a "simulator" object (the cost function) that itself will call functions in the "control vector" object and the "observation operator" object. Then the "observation operator" will initialize a pipeline of transformations including running the "model", and so on and so forth.
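The recursive initialization described above can be sketched as follows. All plugin names, and the "requirements"/"defaults" fields, are hypothetical and only illustrate the mechanism (explicit configuration wins over default dependencies coded in the plugin); this is not the actual pyCIF API.

```python
# Hypothetical plugin registry: each plugin declares its required roles and,
# optionally, a default plugin name for each role.
REGISTRY = {
    "variational": {"requirements": ["minimizer"], "defaults": {}},
    "congrad":     {"requirements": ["simulator"], "defaults": {}},
    "m1qn3":       {"requirements": ["simulator"], "defaults": {}},
    "simulator":   {"requirements": ["obsoper"], "defaults": {}},
    "obsoper":     {"requirements": ["model"], "defaults": {"model": "gauss"}},
    "gauss":       {"requirements": [], "defaults": {}},
}

def build(name, config, cache=None):
    """Initialize plugin `name`, resolving dependencies recursively.

    For each required role: the explicit configuration wins, then the
    plugin's coded default, then a plugin registered under the role name.
    """
    cache = {} if cache is None else cache
    if name in cache:                      # plugins are shared, built once
        return cache[name]
    spec = REGISTRY[name]
    deps = {
        role: build(config.get(role, spec["defaults"].get(role, role)),
                    config, cache)
        for role in spec["requirements"]
    }
    plugin = cache[name] = {"name": name, "deps": deps}
    return plugin

# A variational inversion configured with the CONGRAD minimizer; the model
# falls back to the default "gauss" coded in the "obsoper" plugin
mode = build("variational", {"minimizer": "congrad"})
assert mode["deps"]["minimizer"]["name"] == "congrad"
assert mode["deps"]["minimizer"]["deps"]["simulator"]["deps"][
    "obsoper"]["deps"]["model"]["name"] == "gauss"
```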

Definition of configurations in the CIF
In practice, pyCIF is configured using a YAML configuration file (yaml.org). This file format was primarily chosen for its flexibility and intuitive implementation of hierarchical parameters. In the YAML language, key words are specified with associated values by the user. Indentations indicate sub-levels of parameters, which makes it a consistent tool with the python coding language.

In the following we describe a demonstration case based on a simple implementation of a Gaussian plume dispersion model and simple inversion set-ups. The purpose of this demonstration case is a proof-of-concept of the CIF, with various inversion methods. We comment on and compare inversion set-ups and methods for the purpose of the exercise, but conclusions are not meant to be generalized to any inversion case study due to the simplicity of our example. The test application with a simple Gaussian plume model allows users to quickly carry out the test cases themselves, even on desktop computers, to familiarize themselves with the system. Nevertheless, the Gaussian plume model is not only relevant for teaching purposes, but also for real applications, as it is used in many inversion studies, from the scale of industrial sites with in-situ fixed or mobile measurements (e.g., Kumar et al., 2020; Foster-Wittig et al., 2015; Ars et al., 2017) to larger scales with satellite measurements to optimize individual clusters of industrial or urban emissions (e.g., Nassar et al., 2017; Wang et al., 2020). Other models implemented in the CIF will be presented in a future paper evaluating the differences when using different transport models with all other elements of the configuration identical. The purpose of such an evaluation is to produce a rigorous inter-comparison exercise identifying the effect of transport errors in inversion systems.

Gaussian plume model
Gaussian plume models approximate real turbulent transport by a stable average Gaussian state (Hanna et al., 1982). Such models are not always suitable for comparison with continuous measurements but can be adapted when using observations averaged over time. In the following, we consider the Gaussian plume assumption to be valid for comparing to hourly averaged observations. A simple application of the Gaussian plume model was implemented in the CIF as a testing and training utility. It is computationally easy to run, even on desktop computers. It includes the most basic Gaussian plume equations. In that application, concentrations C at a receptor location (x_0, y_0, z_0) downwind from a source of intensity f at (x_1, y_1, z_1) are given by:

C = f / (2π ū σ_y σ_z) × exp(−y² / (2σ_y²)) × exp(−z² / (2σ_z²)), with σ_y = a x^b and σ_z = c x^d,

where x is the downwind distance between the source and receptor points along the wind axis, y is the distance between the wind axis and the receptor point, v(source, receptor) is the vector linking the source and the receptor points, z is the difference between the source and the receptor altitudes, u is the wind vector and ū is the average wind speed in the domain of simulation. ⟨ · | · ⟩ and ( · × · ) denote the scalar and vector products respectively. (a, b, c, d) are parameters depending on the Pasquill-Gifford atmospheric vertical stability classes. There are 7 Pasquill-Gifford stability classes, from A, extremely unstable (mostly in summer during the afternoon), to G, very stable (occurring mostly during nighttime in winter). As the purpose of the demonstration case is primarily to work on coarsely realistic concentration fields, with a computational cost as low as possible, our implementation of the Gaussian plume model does not include any representation of particle reflection on the ground or at the top of the planetary boundary layer.
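A minimal implementation of such a plume equation could read as follows. The Pasquill-Gifford-type coefficients below are illustrative values for a single (neutral) stability class, not those of the CIF demonstration case, and the geometry is reduced to the horizontal projection of the wind.

```python
import numpy as np

# Illustrative dispersion coefficients: sigma_y = a*x**b, sigma_z = c*x**d
# (single neutral-like stability class; values are assumptions, not CIF's)
A_PG, B_PG, C_PG, D_PG = 0.08, 0.90, 0.06, 0.80

def plume(receptor, source, flux, wind):
    """Concentration at `receptor` from a point source of intensity `flux`,
    Gaussian plume, no reflection on the ground or boundary-layer top."""
    v = np.asarray(receptor, float) - np.asarray(source, float)
    u = np.asarray(wind, float)
    ubar = np.linalg.norm(u)
    u_h = np.linalg.norm(u[:2])
    x = np.dot(v[:2], u[:2]) / u_h                 # downwind distance
    if x <= 0:                                     # receptor upwind: no plume
        return 0.0
    y = abs(v[0] * u[1] - v[1] * u[0]) / u_h       # crosswind distance
    z = v[2]                                       # altitude difference
    sig_y, sig_z = A_PG * x**B_PG, C_PG * x**D_PG
    return (flux / (2 * np.pi * ubar * sig_y * sig_z)
            * np.exp(-y**2 / (2 * sig_y**2))
            * np.exp(-z**2 / (2 * sig_z**2)))

# Concentration decays away from the plume axis and with downwind distance
c_axis = plume((1000, 0, 0), (0, 0, 0), flux=1.0, wind=(5.0, 0.0, 0.0))
c_off = plume((1000, 200, 0), (0, 0, 0), flux=1.0, wind=(5.0, 0.0, 0.0))
c_far = plume((5000, 0, 0), (0, 0, 0), flux=1.0, wind=(5.0, 0.0, 0.0))
assert c_off < c_axis and c_far < c_axis
assert plume((-100, 0, 0), (0, 0, 0), 1.0, (5.0, 0.0, 0.0)) == 0.0
```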

A random noise of 1% of the standard deviation of the forward simulations was added to the true observations to generate measurements. Prior fluxes and their perturbation are generated following the equations below.
Please note that the perturbation of the fluxes is generated using an explicit formula and not a random perturbation with a given covariance matrix. We discuss results with different possible target vectors and covariance matrices. This allows us to assess the sensitivity of an inversion method to the resolution and definition of the target vector and the corresponding covariance matrix.

Inversion set-ups
The objective of our test case is to prove that our system enables users to easily compare the behaviour of different inversion methods in various configurations. To do so, we conduct three sets of four inversions for the demonstration of our system. Each set includes one analytical inversion, one EnSRF-based inversion and two variational inversions, based on the M1QN3 and CONGRAD minimization algorithms respectively. The sequential aspect of the EnSRF is not implemented in the CIF, hence the comparison with the other inversion methods only includes the random sampling of the target vector distribution to solve Eq. (5).
The three sets of inversions differ by the dimension of the target vector and the spatial correlations of errors. The first set uses a target vector based on a grid of 3 × 3-pixel aggregated regions or "bands" independent from each other, i.e. with no spatial error correlations. The target vectors of the second and third sets are defined at the grid-cell resolution with horizontal isotropic error correlations, following an exponential decay with horizontal scales of 500 m and 200000 m respectively; the latter case is used to demonstrate that the implementation of correlation lengths is correct, as very long correlations are equivalent to having only one spatial scaling factor in the target vector. For all inversion set-ups, the target vectors are defined as constant over time, i.e., only one coefficient per spatial dimension is optimized for the 5 × 24 hours computed by the model. In all set-ups, the magnitude of the observation noise used to generate "true" observations is chosen as the observation errors in the matrix R for consistency. In all cases, the diagonal terms of the B matrix are set to 100%.
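The exponentially decaying correlations can be sketched as follows (grid dimensions are illustrative). With a very long correlation length, the correlation matrix tends towards a matrix of ones, i.e. effectively a single scaling factor for the whole domain, which is the consistency check mentioned above.

```python
import numpy as np

nx, ny, dx = 10, 10, 100.0          # illustrative 10x10 grid, 100 m cells

# Coordinates of the grid-cell centres
xs, ys = np.meshgrid(np.arange(nx) * dx, np.arange(ny) * dx, indexing="ij")
pts = np.column_stack([xs.ravel(), ys.ravel()])

def correlations(length):
    """Isotropic exponential decay: C_ij = exp(-d_ij / length)."""
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return np.exp(-d / length)

C_short = correlations(500.0)       # short decay: localized correlations
C_long = correlations(200000.0)     # very long decay: near-uniform matrix

# Long correlation lengths make all cells vary together: the matrix tends
# to all-ones, i.e. effectively one scaling factor for the whole domain
assert C_long.min() > 0.99
# Effective rank (participation ratio of eigenvalues) collapses towards 1
w = np.linalg.eigvalsh(C_long)
assert w.sum() ** 2 / (w**2).sum() < 1.05
```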
To assess the sensitivity of each set-up to the allocated computational resources, we computed the EnSRF and the two variational inversions with varying numbers of simulations N. In the case of the EnSRF

Results and discussion
Posterior increments for the four inversion methods in the three considered demonstration cases are presented with the same color scale of flux increments as in Fig. 4, which represents the true "increments" to be retrieved. In Fig. 8, we present the evolution of the cost function of Eq. (6) depending on the number of simulations used for each inversion method for the three demonstration cases (see details on the corresponding number of simulations of each inversion method in Sect. 4.2.2). The x-axis has been cropped at the origin as the EnSRF value for small sizes of random ensembles diverges to infinity.
In the case with the target vector aggregated on groups of pixels, all inversion methods converge towards a very similar solution. In this case, the posterior increments reproduce the overall structure of the truth-prior difference, with four local maxima surrounding one local minimum in the center of the domain. However, the aggregated target vector results in patterns that are too coarse to be representative of the actual true-prior difference. In the case with the target vector at the grid's resolution with spatial correlations of 500 m, all methods capture well the true-prior difference structure. However, posterior increments are rather noisy compared to the truth. This is due to the spatial correlations being inconsistent with the smooth perturbation. In all cases, CONGRAD converges towards a cost function value similar to the analytical solution. It also converges at a faster pace than the other two methods and, after a limited number of iterations, the convergence rate is close to zero, suggesting that additional simulations do not provide significant additional information to CONGRAD. The variational inversion with M1QN3 converges towards the analytical solution only in the case of the aggregated target vector. The EnSRF inversion converges to the analytical solution in two of the three cases (aggregated and full-resolution with small correlation lengths). For the two full-resolution cases, M1QN3 converges to a local minimum instead of the global minimum, probably due to numerical artifacts in the algorithm which make it mistake the global minimum for a local critical point with a gradient close to zero. This issue may be overcome by allowing the algorithm to carry on for more iterations.
Overall, CONGRAD appears to be the most cost-efficient algorithm for estimating the analytical solution in our very simple demonstration case. Though not as efficient, the EnSRF method can approximate the analytical solution at a reduced cost. By design, its computation can easily be parallelized, which can allow faster computation than CONGRAD when parallel computational resources are available. M1QN3 proves not as efficient as its CONGRAD counterpart but, contrary to CONGRAD, it can accommodate non-linear cases.
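The "embarrassingly parallel" character of ensemble methods can be sketched as follows: each member's forward simulation is independent of the others, so they can all be dispatched to separate workers, whereas variational iterations must run one after the other. This toy illustration uses threads and a trivial stand-in for a transport model; in a real inversion, one would dispatch full CTM runs to a job scheduler. All names are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def forward_model(flux_member):
    """Toy stand-in for one CTM simulation of a single ensemble member."""
    H = np.array([[1.0, 0.5], [0.2, 1.0], [0.7, 0.7]])  # fixed toy operator
    return H @ flux_member

def run_ensemble(members, parallel=True):
    """Run the forward model for all ensemble members.
    The members are mutually independent, so the ensemble can be evaluated
    concurrently; a variational minimizer, by contrast, needs the result of
    iteration k before it can start iteration k+1."""
    if parallel:
        with ThreadPoolExecutor() as pool:
            return list(pool.map(forward_model, members))
    return [forward_model(m) for m in members]
```

The parallel and sequential paths return identical results; only the wall-clock cost differs once each forward run is expensive.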
The reduction of uncertainties and posterior uncertainty matrices are shown in Figs. 6 and 7, with equivalents in the Supplement. Regarding posterior uncertainties, CONGRAD proves relatively efficient at approximating the analytical solution, especially at the pixel resolution. The variational inversion with Monte Carlo and M1QN3 computations and the inversion with EnSRF are much noisier. Approximating posterior matrices requires a large number of Monte Carlo members and proves very challenging in real-world applications.
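The Monte Carlo approach to posterior uncertainties can be sketched as follows: perturb the prior and the observations consistently with B and R, solve each perturbed inversion, and take the empirical covariance of the posterior members. This is a toy linear-Gaussian illustration with hypothetical names and dimensions, not the pyCIF implementation.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy linear-Gaussian setup (illustrative dimensions)
n, p = 3, 6                               # control and observation sizes
H = rng.standard_normal((p, n))           # linear observation operator
B = np.diag([0.4, 0.6, 0.5])              # prior error covariance
R = 0.2 * np.eye(p)                       # observation error covariance
x_b = np.zeros(n)
y = H @ rng.standard_normal(n)            # synthetic observations

# Analytical posterior covariance: A = (B^-1 + H^T R^-1 H)^-1
A = np.linalg.inv(np.linalg.inv(B) + H.T @ np.linalg.solve(R, H))

def analytical_inversion(x_b_i, y_i):
    """Posterior mean for one (perturbed) prior/observation pair."""
    K = A @ H.T @ np.linalg.inv(R)        # gain matrix
    return x_b_i + K @ (y_i - H @ x_b_i)

# Monte Carlo estimate: each member perturbs the prior and the observations
n_members = 5000
L_B = np.linalg.cholesky(B)
L_R = np.linalg.cholesky(R)
members = np.empty((n_members, n))
for i in range(n_members):
    x_b_i = x_b + L_B @ rng.standard_normal(n)
    y_i = y + L_R @ rng.standard_normal(p)
    members[i] = analytical_inversion(x_b_i, y_i)
A_mc = np.cov(members, rowvar=False)
# A_mc converges to A only at a ~1/sqrt(n_members) rate, which is why
# Monte Carlo posterior uncertainties are costly in real applications.
```

Each member here costs one cheap matrix solve; in a real inversion it costs a full (iterative) inversion, which makes the slow Monte Carlo convergence the dominant practical limitation.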

Conclusions
We have introduced here a new generic inversion framework that aims at merging existing inversion systems, in order to share development and maintenance efforts and to foster collaboration on inversion studies. It has been implemented in a way that is fully independent of the inversion configuration: of the application scales, the species of interest, the CTM used, and the assumptions for data assimilation, as well as of the practical operations and transformations applied to the data in pre- and post-processing stages. This framework will prevent redundant developments among participating research groups and will allow for a very diverse range of applications within the same system. New developments can be made efficiently, with limited risks of unexpected side effects, and, thanks to the generic structure of the code, specific developments for a given application can be directly applied to other applications.
For instance, new inversion methods implemented in the CIF can be directly tested with various transport models. With modern inversion methods moving towards a hybrid paradigm of variational and ensemble methods, the flexibility of the CIF will be a valuable asset, as abstract methods can easily be combined with each other.
We have presented the first version of this Community Inversion Framework (CIF) alongside its dedicated Python library, pyCIF. As a first step, analytical inversions, variational inversions with M1QN3 and CONGRAD, and the EnSRF have been implemented to demonstrate the general applicability of the CIF. The four inversion techniques were tested here on a case with a Gaussian plume model and with observations generated from known "true" fluxes. The impact of spatial correlations and of spatial aggregation, which drive the shape of the control vectors used in this scientific community, has been illustrated. The analytical inversion is the most accurate approach to retrieve the true fluxes, as expected, followed by variational inversions with the CONGRAD algorithm in our simple linear case. In our simple case, EnSRF and M1QN3 generally take longer to converge towards the true pattern of the fluxes, even though EnSRF inversions have the advantage of being fully parallelizable, in contrast to variational inversions, which are sequential by design and therefore harder to parallelize (e.g., Chevallier, 2013).
The next step for the CIF is the implementation of a large variety of CTMs. Implementing CTMs already interfaced with other inversion systems should not bring particular conceptual challenges, as all interface operations are already written in their original inversion system; in most cases, re-arranging existing routines is sufficient to interface a model with the CIF. One particular challenge concerns I/O optimization: the generation of inputs and the processing of outputs can be time-consuming and, in some very heavy applications, require specific numerical and coding optimizations. The very general formalism of the CIF may hamper the ability to apply these particular optimizations for some models.
Best efforts will have to be deployed to take full advantage of advanced I/O and data manipulation libraries in Python to limit this weakness. CHIMERE, LMDz, TM5, FLEXPART, and STILT have already been implemented, and a sequel paper will evaluate and compare their behaviour in similar inversion set-ups. COSMO-GHG and WRF-CHEM are also planned to be implemented in the near future to widen the developer/user community of the system. The use of various CTMs in identical inversion configurations (inversion method, observation and target vectors, consistent interface, etc.) will allow extensive assessments of transport errors in inversions. Despite the many past efforts put into inter-comparison exercises, such a level of inter-comparability has never been reached and will be a natural by-product of the CIF in the future. In addition, comparing posterior uncertainties from different inversion methods and set-ups will make it possible to fully assess the consistency of different inversion results.
The flexibility of the CIF allows very complex operations to be included easily. These include the use of satellite observations, which will be evaluated in a future paper, and inversions using observations of isotopic ratios to optimize both surface fluxes and source signatures (Thanwerdas et al., 2021). In addition, even though greenhouse gas studies have been the main motivation behind the development of the CIF, the system will also be tested for multi-species inversions including air pollutants.
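As a closing illustration of the demonstration set-up, a minimal Gaussian plume forward model is sketched below. This is a generic textbook formulation, not the exact model used in the CIF test case; names and parameter values are illustrative, and the dispersion parameters are assumed to already encode the downwind distance.

```python
import numpy as np

def gaussian_plume(y, z, q, u, h, sigma_y, sigma_z):
    """Steady-state Gaussian plume concentration at crosswind distance y
    and height z, from a point source of strength q (mass per unit time)
    at height h, with wind speed u along the plume axis and dispersion
    parameters sigma_y, sigma_z (which grow with downwind distance).
    Includes the standard ground-reflection term."""
    lateral = np.exp(-y**2 / (2 * sigma_y**2))
    vertical = (np.exp(-(z - h)**2 / (2 * sigma_z**2))
                + np.exp(-(z + h)**2 / (2 * sigma_z**2)))
    return q / (2 * np.pi * u * sigma_y * sigma_z) * lateral * vertical

# Ground-level concentration on the plume axis (illustrative values)
c = gaussian_plume(y=0.0, z=0.0, q=1.0, u=5.0, h=10.0,
                   sigma_y=8.0, sigma_z=6.0)
```

The key property for the demonstration is that the concentration is linear in the source strength q, which is what makes the test-case inversion a linear problem with an exact analytical solution.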