• #### Flows on Metric Graphs with General Boundary Conditions

In this note we study the generation of $C_0$-semigroups by first order differential operators on $\mathrm{L}^p (\mathbb{R}_+,\mathbb{C}^{\ell})\times \mathrm{L}^p ([0,1],\mathbb{C}^{m})$ with general boundary conditions. In many cases we are able to characterize the generation property in terms of the invertibility of a matrix associated to the boundary conditions. The abstract results are used to study well-posedness of transport equations on non-compact metric graphs.
Graph, Transport equation, Rank, Hamiltonian, Cauchy problem, Attention, Total-Variation regularization, Vector measure, Permutation, Velocity function...
• #### Kepler motion on single-sheet hyperboloid

The classical Kepler-Coulomb problem on the single-sheeted hyperboloid $H^{3}_1$ is solved in the framework of the Hamilton--Jacobi equation. We have proven that all the bounded orbits are closed and periodic. The paths are ellipses or circles for finite motion.
Constant curvature, Hamilton-Jacobi equation, Major axis, Hamiltonian, Pseudosphere, De Sitter space, Quantum mechanics, Coherent state, Ordinary differential equations, Elliptical orbit...
• #### MLQA: Evaluating Cross-lingual Extractive Question Answering (ver. 3)

Question answering (QA) models have shown rapid progress enabled by the availability of large, high-quality benchmark datasets. Such annotated datasets are difficult and costly to collect, and rarely exist in languages other than English, making training QA systems in other languages challenging. An alternative to building large monolingual training datasets is to develop cross-lingual systems which can transfer to a target language without requiring training data in that language. In order to develop such systems, it is crucial to invest in high quality multilingual evaluation benchmarks to measure progress. We present MLQA, a multi-way aligned extractive QA evaluation benchmark intended to spur research in this area. MLQA contains QA instances in 7 languages, namely English, Arabic, German, Spanish, Hindi, Vietnamese and Simplified Chinese. It consists of over 12K QA instances in English and 5K in each other language, with each QA instance being parallel between 4 languages on average. MLQA is built using a novel alignment context strategy on Wikipedia articles, and serves as a cross-lingual extension to existing extractive QA datasets. We evaluate current state-of-the-art cross-lingual representations on MLQA, and also provide machine-translation-based baselines. In all cases, transfer results are shown to be significantly behind training-language performance.
Training set, F1 score, Machine translation, Statistics, Attention, Grammar, Engineering, Natural language inference, Radial distribution functions, Model selection...
• #### Stable, scalable, decentralized P2P file sharing with non-altruistic peers

P2P systems provide a scalable solution for distributing large files in a network. The file is split into many chunks, and peers contact other peers to collect missing chunks to eventually complete the entire file. The so-called `rare chunk' phenomenon, where a single chunk becomes rare and prevents peers from completing the file, is a threat to the stability of such systems. Practical systems such as BitTorrent overcome this issue by requiring a global search for the rare chunk, which necessitates a centralized mechanism. We demonstrate a new system based on an approximate rare-chunk rule, allowing for completely distributed file sharing while retaining scalability and stability. We assume non-altruistic peers and the seed is required to make only a minimal contribution.
P2p, Lyapunov function, Markov process, Coherent neutrino scattering, Peer-to-peer network, Extinction, Instability, Poisson process, Intensity, Cross-correlation function...
• #### Large-Scale Study of Curiosity-Driven Learning

Reinforcement learning algorithms rely on carefully engineering environment rewards that are extrinsic to the agent. However, annotating each environment with hand-designed, dense rewards is not scalable, motivating the need for developing reward functions that are intrinsic to the agent. Curiosity is a type of intrinsic reward function which uses prediction error as reward signal. In this paper: (a) We perform the first large-scale study of purely curiosity-driven learning, i.e. without any extrinsic rewards, across 54 standard benchmark environments, including the Atari game suite. Our results show surprisingly good performance, and a high degree of alignment between the intrinsic curiosity objective and the hand-designed extrinsic rewards of many game environments. (b) We investigate the effect of using different feature spaces for computing prediction error and show that random features are sufficient for many popular RL game benchmarks, but learned features appear to generalize better (e.g. to novel game levels in Super Mario Bros.). (c) We demonstrate limitations of the prediction-based rewards in stochastic setups. Game-play videos and code are at https://pathak22.github.io/large-scale-curiosity/
Embedding, Feature space, Reinforcement learning, Optimization, Architecture, Entropy, Inference, Statistics, Autoencoder, Completeness...
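
The idea of using prediction error in a fixed random feature space as an intrinsic reward can be sketched in a few lines. This is only a toy illustration under our own assumptions, not the authors' implementation: the names `make_random_features`, `LinearForwardModel`, and `intrinsic_reward` are hypothetical, the random features are a single tanh layer, and the forward model is linear and trained by plain gradient descent.

```python
import random

def make_random_features(obs_dim, feat_dim, seed=0):
    """Return a fixed, randomly initialised feature map phi (never trained)."""
    rng = random.Random(seed)
    W = [[rng.gauss(0, 1) for _ in range(obs_dim)] for _ in range(feat_dim)]

    def phi(obs):
        # One random projection followed by tanh (an assumption of this sketch).
        return [_tanh(sum(w * o for w, o in zip(row, obs))) for row in W]

    return phi

def _tanh(x):
    import math
    return math.tanh(x)

class LinearForwardModel:
    """Predicts phi(next_obs) from the concatenation [phi(obs), action]."""

    def __init__(self, in_dim, out_dim, lr=0.01):
        self.W = [[0.0] * in_dim for _ in range(out_dim)]
        self.lr = lr

    def predict(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.W]

    def update(self, x, target):
        """One gradient step; returns the squared error measured before the step."""
        err = [p - t for p, t in zip(self.predict(x), target)]
        for row, e in zip(self.W, err):
            for j, xj in enumerate(x):
                row[j] -= self.lr * e * xj
        return sum(e * e for e in err)

def intrinsic_reward(model, phi, obs, action, next_obs):
    """Curiosity reward = forward-model prediction error in feature space."""
    x = phi(obs) + list(action)
    return model.update(x, phi(next_obs))

if __name__ == "__main__":
    phi = make_random_features(obs_dim=3, feat_dim=4)
    model = LinearForwardModel(in_dim=4 + 1, out_dim=4)
    # Revisiting the same transition makes it predictable, so its reward shrinks.
    for step in range(5):
        r = intrinsic_reward(model, phi, [0.1, 0.2, 0.3], [1.0], [0.3, 0.1, 0.2])
        print(step, r)
```

Because the reward is the error of a model that keeps learning, transitions the agent has seen often yield little reward, which pushes it toward unfamiliar states.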
• #### Diversity is All You Need: Learning Skills without a Reward Function (ver. 6)

Intelligent creatures can explore their environments and learn useful skills without supervision. In this paper, we propose DIAYN ('Diversity is All You Need'), a method for learning useful skills without a reward function. Our proposed method learns skills by maximizing an information theoretic objective using a maximum entropy policy. On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping. In a number of reinforcement learning benchmark environments, our method is able to learn a skill that solves the benchmark task despite never receiving the true task reward. We show how pretrained skills can provide a good parameter initialization for downstream tasks, and can be composed hierarchically to solve complex, sparse reward tasks. Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.
Entropy, Mutual information, Regularization, Reinforcement learning, Robotics, Information theory, Stationary distribution, Optimization, Q-function, Unsupervised learning...
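
The abstract does not spell out the information-theoretic objective. As we recall it from the DIAYN paper (so treat the exact form as an assumption here), with $S$ the state, $A$ the action, and $Z$ the latent skill variable, the quantity maximized over policy parameters $\theta$ is

$$
\mathcal{F}(\theta) = I(S;Z) + \mathcal{H}[A \mid S] - I(A;Z \mid S) = \mathcal{H}[Z] - \mathcal{H}[Z \mid S] + \mathcal{H}[A \mid S, Z].
$$

The second equality follows from $I(S;Z) = \mathcal{H}[Z] - \mathcal{H}[Z \mid S]$ and $I(A;Z \mid S) = \mathcal{H}[A \mid S] - \mathcal{H}[A \mid S, Z]$. Read term by term: skills should be diverse (high $\mathcal{H}[Z]$), identifiable from the states they visit (low $\mathcal{H}[Z \mid S]$), and otherwise act as randomly as possible (high $\mathcal{H}[A \mid S, Z]$), which is where the maximum entropy policy enters.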
• #### Visual Reinforcement Learning with Imagined Goals (ver. 2)

For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires. Furthermore, to provide the requisite level of generality, these skills must handle raw sensory input such as images. In this paper, we propose an algorithm that acquires such general-purpose skills by combining unsupervised representation learning and reinforcement learning of goal-conditioned policies. Since the particular goals that might be required at test-time are not known in advance, the agent performs a self-supervised "practice" phase where it imagines goals and attempts to achieve them. We learn a visual representation with three distinct purposes: sampling goals for self-supervised practice, providing a structured transformation of raw sensory inputs, and computing a reward signal for goal reaching. We also propose a retroactive goal relabeling scheme to further improve the sample-efficiency of our method. Our off-policy algorithm is efficient enough to learn policies that operate on raw image observations and goals for a real-world robotic system, and substantially outperforms prior techniques.
Reinforcement learning, Robotics, Autoencoder, Q-function, Latent variable, Ground truth, Hyperparameter, Mean squared error, Euclidean distance, Q-learning...
• #### Improved Residual Vector Quantization for High-dimensional Approximate Nearest Neighbor Search

Quantization methods have been introduced to perform large-scale approximate nearest neighbor search. Residual Vector Quantization (RVQ) is one of the effective quantization methods. RVQ uses a multi-stage codebook learning scheme to lower the quantization error stage by stage. However, RVQ has two major limitations when applied to high-dimensional approximate nearest neighbor search: 1. The performance gain diminishes quickly with added stages. 2. Encoding a vector with RVQ is actually NP-hard. In this paper, we propose an improved residual vector quantization (IRVQ) method. IRVQ learns the codebook at each stage with a hybrid of subspace clustering and warm-started k-means to prevent the performance gain from dropping, and uses a multi-path encoding scheme to encode a vector with lower distortion. Experimental results on the benchmark datasets show that our method substantially improves on RVQ and delivers better performance than the state of the art.
Quantization, K-means++, South ecliptic pole, Nearest neighbor search, Entropy, Optimization, NP-hard problem, Hybridization, Feature space, Star...
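
To make the multi-stage scheme concrete, here is a minimal sketch of plain RVQ with greedy stage-by-stage encoding. It is the baseline that the abstract sets out to improve, not IRVQ itself: there is no subspace clustering, no warm start, and no multi-path encoding. All function names are hypothetical, and the k-means routine is a bare-bones Lloyd iteration written for this example.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, v):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's algorithm; an empty cluster keeps its previous center."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        for j, g in enumerate(groups):
            if g:
                centers[j] = [sum(c) / len(g) for c in zip(*g)]
    return centers

def subtract(v, c):
    return [x - y for x, y in zip(v, c)]

def train_rvq(points, stages, k):
    """Learn one codebook per stage on the residuals left by earlier stages."""
    codebooks, residuals = [], [list(p) for p in points]
    for s in range(stages):
        cb = kmeans(residuals, k, seed=s)
        codebooks.append(cb)
        residuals = [subtract(r, cb[nearest(cb, r)]) for r in residuals]
    return codebooks

def encode(codebooks, v):
    """Greedy encoding: pick the nearest codeword at each stage in turn."""
    codes, r = [], list(v)
    for cb in codebooks:
        i = nearest(cb, r)
        codes.append(i)
        r = subtract(r, cb[i])
    return codes

def decode(codebooks, codes):
    """Reconstruction is the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for cb, i in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, cb[i])]
    return out

if __name__ == "__main__":
    rng = random.Random(1)
    data = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
    cbs = train_rvq(data, stages=3, k=8)
    codes = encode(cbs, data[0])
    print(codes, sq_dist(data[0], decode(cbs, codes)))
```

Greedy encoding commits to one codeword per stage and never revisits the choice. A multi-path scheme such as the one the abstract describes would instead keep several candidate code sequences alive across stages and return the one with the lowest final distortion.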
• #### Domain-matched Pre-training Tasks for Dense Retrieval

Pre-training on larger datasets with ever increasing model size is now a proven recipe for increased performance across almost all NLP tasks. A notable exception is information retrieval, where additional pre-training has so far failed to produce convincing results. We show that, with the right pre-training setup, this barrier can be overcome. We demonstrate this by pre-training large bi-encoder models on 1) a recently released set of 65 million synthetically generated questions, and 2) 200 million post-comment pairs from a preexisting dataset of Reddit conversations made available by pushshift.io. We evaluate on a set of information retrieval and dialogue retrieval benchmarks, showing substantial improvements over supervised baselines.
Computational linguistics, Information retrieval, Training set, Information and communication technologies, Distillation, Natural language inference, Model selection, Architecture, Caching, Optimization...
• #### An improved lower bound for multicolor Ramsey numbers and the half-multiplicity Ramsey number problem (ver. 2)

The multicolor Ramsey number problem asks, for each pair of natural numbers $\ell$ and $t$, for the largest $\ell$-coloring of a complete graph with no monochromatic clique of size $t$. Recent works of Conlon-Ferber and Wigderson have improved the longstanding lower bound for this problem. We make a further improvement by replacing an explicit graph appearing in their constructions by a random graph. Graphs useful for this construction are exactly those relevant for a problem of Erd\H{o}s on graphs with no large cliques and few large independent sets. We also make some basic observations about this problem.
Graph, Random graph, Rank, Lower and upper, Vector space, Turán's theorem, Galois field, Homomorphism, Stirling numbers of the second kind, Probability...
• #### Dwarf stellar haloes: a powerful probe of small scale galaxy formation and the nature of dark matter

We use N-body cosmological simulations and empirical galaxy models to study the merger history of dwarf-mass galaxies (with M_halo~10^10 M_Sun). Our input galaxy models describe the stellar mass-halo mass relation, and the galaxy occupation fraction. The number of major and minor mergers depends on the type of dark matter; in particular, minor mergers are greatly suppressed in warm dark matter models. In addition, the number of mergers that bring in stars is strongly dependent on the galaxy occupation model. For example, minor mergers are negligible for stellar halo growth in models with a high mass threshold for galaxy formation (i.e. 10^9.3 M_Sun at z=0). Moreover, this threshold for galaxy formation can also determine the relative difference (if any) between the stellar haloes of satellite and field dwarfs. Using isolated simulations of dwarf-dwarf mergers, we show that the relative frequency of major and minor mergers predicts very different stellar haloes: typically, "intermediate" dark matter merger ratios (~1:5) maximise the growth of distant stellar haloes. We discuss the observability of dwarf stellar haloes and find that the surface brightness of these features is incredibly faint. However, when several dwarfs are stacked together, models that form particularly rich stellar haloes could be detectable. Finally, we show that stellar streams in the Galactic halo overlapping in phase-space with known dwarf satellites are likely remnants of their stripped stellar haloes. The mere existence of dwarf stellar haloes can already put constraints on some small-scale models, and thus observational probes should be a high priority.
Stellar halo, Galaxy, Galaxy Formation, Dark matter, Dwarf galaxy, Milky Way, Warm dark matter, Stellar-to-halo mass relation, Virial mass, Star...
• #### A Cosmological Underdensity Does Not Solve the Hubble Tension

A potential solution to the Hubble tension is the hypothesis that the Milky Way is located near the center of a matter underdensity. We model this scenario through the Lema\^itre-Tolman-Bondi formalism with the inclusion of a cosmological constant ($\Lambda$LTB) and consider a generalized Gaussian parametrization for the matter density profile. We constrain the underdensity and the background cosmology with a combination of data sets: the Pantheon Sample of type Ia supernovae (both the full catalogue and a redshift-binned version of it), a collection of baryon acoustic oscillations data points and the distance priors extracted from the latest Planck data release. The analysis with the binned supernovae suggests a preference for a $-13 \%$ density drop with a size of approximately 300 Mpc, interestingly matching the prediction for the so-called KBC void already identified on the basis of independent analyses using galaxy distributions. The constraints obtained with the full Pantheon Sample are instead compatible with a homogeneous cosmology and we interpret this radically different result as a cautionary tale about the potential bias introduced by employing a binned supernova data set. We quantify the level of improvement on the Hubble tension by analyzing the constraints on the B-band absolute magnitude of the supernovae, which provides the calibration for the local measurements of $H_0$. Since no significant difference is observed with respect to an analogous fit performed with a standard $\Lambda$CDM cosmology, we conclude that the potential presence of a local underdensity does not resolve the tension and does not significantly degrade current supernova constraints on $H_0$.
Supernova, Baryon acoustic oscillations, Hubble constant tension, Redshift bins, Cosmic void, Supernova Type Ia, Cosmic microwave background, Cosmology, Cosmological parameters, Density parameter...
• #### Hints of dark matter-neutrino interactions in Lyman-$\alpha$ data

In this letter we investigate the possibility that dark matter and (massive) neutrinos can interact via a simple, constant cross section. Building on previous numerical efforts, we constrain this model with CMB, BAO and, in particular, Lyman-$\alpha$ data. We find that the latter hint at a significant departure from $\Lambda$CDM, with a preference for an interaction strength about 3$\sigma$ away from zero. We trace the origin of this preference back to the additional tilt that the interacting scenario can imprint on the Lyman-$\alpha$ flux, solving a well-known tension between early-time and Lyman-$\alpha$ probes. Future work including complementary Lyman-$\alpha$ data will be crucial in order to test these results.
Dark matter, Cold dark matter, Matter power spectrum, Neutrino, Warm dark matter, Baryon acoustic oscillations, Cosmic microwave background, Neutrino interactions, Baryon Oscillation Spectroscopic Survey, HIRES spectrometer...
• #### Convexity, large charge and the large-N phase diagram of the $\varphi^4$ theory

In this note we discuss the phase space of the O(2N) vector model in the presence of a quadratic and a quartic interaction by writing the large-N effective potential using large charge methods in dimensions 2<D<4 and 4<D<6. Based on a simple discussion of the convexity properties of the grand potential, we find very different behavior in the two regimes: while in 2<D<4, the theory is well-behaved, the model in 4<D<6 leads to a complex CFT in the UV, consistently with earlier results. We also find a new metastable massive phase in the high-energy regime for the theory on the cylinder.
Effective potential, Asymptotic expansion, Zeta function, Scaling dimension, Phase diagram, Resummation, Unitarity, Critical point, Resurgence, Path integral...
• #### Following the flow for large N and large charge

We discuss the O(2N) vector model in three dimensions. While this model flows to the Wilson-Fisher fixed point when fine tuned, working in a double-scaling limit of large N and large charge allows us to study the model away from the critical point and even to follow the RG flow from the UV to the IR. The crucial observation is that the effective potential -- at leading order in N but exact to all orders in perturbation theory -- is the Legendre transform of the grand potential at fixed charge. This allows us to write an effective action and the free energy for generic values of the coupling in a very simple fashion and without evaluating any Feynman diagrams.
Effective potential, Critical point, Effective action, Scaling limit, Feynman diagrams, Torus, Zeta function, Higgs phase, Curvature, Saddle point...
• #### On exact overlaps for $\mathfrak{gl}(N)$ symmetric spin chains

We study the integrable two-site states of the quantum integrable models solvable by the nested algebraic Bethe ansatz and possessing $\mathfrak{gl}(N)$-invariant R-matrix. We investigate the overlaps between the integrable two-site states and the wave-functions. To find exact derivations for the factorized overlap formulas for the nested integrable systems is a longstanding unsolved problem. In this paper we give a derivation for a large class of the integrable states of the $\mathfrak{gl}(N)$ symmetric spin chain. The first part of the derivation is to calculate recursion relations for the off-shell overlap that uniquely fix it. Using these recursions we prove that the normalized overlaps of the multi-particle states have factorized forms which contain the products of the one-particle overlaps and the ratio of the Gaudin-like determinants. We also show that the previously proposed overlap formulas agree with our general formula.
Monodromy matrix, Final state, Embedding, Bethe ansatz, Transfer matrix, Automorphism, Monodromy, Lax operator, Permutation, Matrix product states...
• #### $J\bar T$-deformed CFTs as non-local CFTs

Various holographic set-ups in string theory suggest the existence of non-local, UV complete two-dimensional QFTs that possess Virasoro symmetry, in spite of their non-locality. We argue that $J\bar T$-deformed CFTs are the first concrete realisation of such "non-local CFTs", through a detailed analysis of their classical and quantum symmetry algebra. Classically, the symmetries consist of an infinite set of left-moving conformal and affine $U(1)$ transformations that generate a Witt-Kac-Moody algebra, as well as a set of non-local, field-dependent generalizations of right-moving conformal and affine $U(1)$ transformations, whose algebra depends on the chosen basis. Notably, there exists a basis, denoted as the "flowed" representation, in which the right-moving charge algebra is simply Witt-Kac-Moody. At the quantum level, we provide a concrete prescription for constructing the symmetry generators via a combination of the flow equations they satisfy and the Sugawara construction, and use this to explicitly resolve the ordering ambiguities and the quantum corrections to the generators up to second order in the $J\bar T$ coupling parameter. This construction naturally produces the "flowed" generators, whose algebra is Virasoro-Kac-Moody to all orders in the coupling, with the same central extension as that of the undeformed CFT. We use this input to work out the quantum modifications to the "unflowed" generator algebra. A peculiarity of the Virasoro generators we study is that their zero mode does not equal the Hamiltonian, but is a quadratic function of it; this helps reconcile the Virasoro symmetry with the non-locality of the model. We argue that $T\bar T$-deformed CFTs also possess Virasoro symmetry, and discuss the existence of such a symmetry in more general non-local QFTs.
Conformal field theory, Hamiltonian, Zero mode, Kac-Moody algebra, Spectral flow, Central charge, Virasoro algebra, Quantum level, Classical limit, Symmetry algebra...
• #### Persistent Homology of Graph Embeddings (ver. 2)

Popular network models such as the mixed membership and standard stochastic block model are known to exhibit distinct geometric structure when embedded into $\mathbb{R}^{d}$ using spectral methods. The resulting point cloud concentrates around a simplex in the first model, whereas it separates into clusters in the second. By adopting the formalism of generalised random dot-product graphs, we demonstrate that both of these models, and different mixing regimes in the case of mixed membership, may be distinguished by the persistent homology of the underlying point distribution in the case of adjacency spectral embedding. Moreover, despite non-identifiability issues, we show that the persistent homology of the support of the distribution and its super-level sets can be consistently estimated. As an application of our consistency results, we provide a topological hypothesis test for distinguishing the standard and mixed membership stochastic block models.
Persistent homology, Graph, Point cloud, Network model, Embedding, Spectral method...
• #### ResNet strikes back: An improved training procedure in timm

The influential Residual Networks designed by He et al. remain the gold-standard architecture in numerous scientific publications. They typically serve as the default architecture in studies, or as baselines when new architectures are proposed. Yet there has been significant progress on best practices for training neural networks since the inception of the ResNet architecture in 2015. Novel optimization & data-augmentation have increased the effectiveness of the training recipes. In this paper, we re-evaluate the performance of the vanilla ResNet-50 when trained with a procedure that integrates such advances. We share competitive training settings and pre-trained models in the timm open-source library, with the hope that they will serve as better baselines for future work. For instance, with our more demanding training setting, a vanilla ResNet-50 reaches 80.4% top-1 accuracy at resolution 224x224 on ImageNet-val without extra data or distillation. We also report the performance achieved with popular models with our training procedure.
• #### On quenches to the critical point of the three states Potts model -- Matrix Product State simulations and CFT

Conformal Field Theories (CFTs) have been used extensively to understand the physics of critical lattice models at equilibrium. However, the applicability of CFT calculations to the behavior of the lattice systems in the out-of-equilibrium setting is not entirely understood. In this work, we compare the CFT results of the evolution of the entanglement spectrum after a quantum quench with numerical calculations of the entanglement spectrum of the three-state Potts model using matrix product state simulations. Our results lead us to conjecture that CFT does not describe the entanglement spectrum of the three-state Potts model at long times, contrary to what happens in the Ising model. We thus numerically simulate the out-of-equilibrium behaviour of the Potts model according to the CFT protocol, i.e. by taking a particular product state and "cooling" it, then quenching to the critical point, and find that, in this case, the entanglement spectrum is indeed described by the CFT at long times.
Conformal field theory, Three state Potts model, Quenching, Entanglement spectrum, Matrix product states, Critical point, Lattice (order), Cooling, Lattice model, Ising model...
• #### The universality of islands outside the horizon

We systematically calculate the quantum extremal surface (QES) associated with Hawking radiation for general $D$-dimensional ($D\geq2$) asymptotically flat (or AdS) eternal black holes using the island formula. By adopting the standard "black hole couples thermal baths" model, we find that a QES exists in the near-horizon region outside the black hole when $c\cdot G_{(D)}$ is small enough, where $c$ is the central charge of the conformal matter and $G_{(D)}$ is the Newton constant. The locations of the QES in these backgrounds are obtained, and the late-time radiation entropy saturates at twice the black hole entropy. Finally, we numerically check that the no-island configuration exists once $c\cdot G_{(D)}$ exceeds a certain upper bound in two-dimensional generalized dilaton theories (GDT).
• #### Thermalization of Holographic Excited States

We propose a real-time holographic framework to study thermalization processes of a family of QFT excited states. The construction builds on Skenderis-van Rees's holographic duals to QFT Schwinger-Keldysh complex-time ordered paths. Thermalization is explored by choosing a set of observables $F_n$ which essentially isolate the excited state contribution. Focusing on theories defined on compact manifolds and with excited states defined in terms of Euclidean path integrals, we identify boundary conditions that allow one to avoid any number of modes in the initial field state. In the large conformal dimensions regime, we give precise prescriptions on how to compute the observables in terms of bulk geodesics.
Geodesic, Excited state, Anti de Sitter space, Manifold, Conformal field theory, Path integral, Scaling dimension, Thermalisation, Complex geodesic, Wavefunction...
• #### Topos-Theoretic Approaches to Quantum Theory

This review paper surveys work by Isham, Butterfield, Doering, Landsman, Spitters, Heunen et al. about topos-theoretic analyses of quantum theory. It aims to provide a synthesized account of their various approaches.
Von Neumann algebra, Boolean algebra, Kinematics, Quantum logic, Quantum mechanics, Lattice (order), Homomorphism, Partially ordered set, Bundle, Duality...
• #### QFT without infinities and hierarchy problem

The standard way to do computations in Quantum Field Theory (QFT) often results in the requirement of dramatic cancellations between contributions induced by a "heavy" sector into the physical observables of the "light" (or low energy) sector, the phenomenon known as the "hierarchy problem". This procedure uses divergent multi-loop Feynman diagrams, their regularisation to handle the UV divergences, and then renormalisation to remove them. At the same time, the ultimate outcome of the renormalisation is the mapping of several finite parameters defining the renormalisable field theory into different observables (e.g. all kinds of particle cross-sections). In this paper, we first demonstrate how to relate the parameters of the theory to observables without running into intermediate UV divergences. Then we go one step further: we show how in theories with different mass scales, all physics of the "light" sector can be computed in a way which does not require dramatic cancellations induced by physics of the "heavy" sector. The existence of such a technique suggests that the "hierarchy problem" in renormalisable theories is not really physical, but rather an artefact of the conventional procedure to compute correlation functions. If the QFT is defined by the "divergencies-free" method, all fine-tunings in theories with well separated energy scales may be avoided.
Quantum field theory, Standard Model, Hierarchy problem, Ultraviolet divergence, Naturalness, Feynman diagrams, Propagator, Cosmological constant, Higgs boson mass, Two-point correlation function...
• #### Generalized entropy production in collisionless plasma flows and turbulence

Collisionless plasmas exhibit nonthermal and anisotropic particle distributions after being energized; as a consequence, they enter a low-entropy state relative to the thermal state. The Vlasov equations predict that in a collisionless plasma with closed boundaries, entropy is formally conserved, along with an infinite set of other Casimir invariants; this provides a seemingly strong constraint that may explain how plasmas maintain low entropy. Nevertheless, entropy is commonly believed to be produced due to phase mixing or nonlinear entropy cascades. The question of whether such anomalous entropy production occurs, and of how to characterize it quantitatively, is a fundamental problem in plasma physics. We construct a new theoretical framework for characterizing entropy production (in a generalized sense) based on a set of ideally conserved "Casimir momenta" derived from the Casimir invariants. The growth of the Casimir momenta relative to the average particle momentum indicates entropy production. We apply this framework to quantify entropy production in particle-in-cell simulations of laminar flows and turbulent flows driven in relativistic plasma, where efficient nonthermal particle acceleration is enabled. We demonstrate that a large amount of anomalous entropy is produced by turbulence despite nonthermal features. The Casimir momenta grow to cover a range of energies in the nonthermal tail of the distribution, and we correlate their growth with spatial structures. These results have implications for reduced modeling of nonthermal particle acceleration and for diagnosing irreversible dissipation in collisionless plasmas such as the solar wind and Earth's magnetosphere.
Entropy, Casimir invariant, Entropy production, Turbulence, Collisionless plasma, Particle-in-cell, Momentum space, Dissipation, Shear flow, Anisotropy...
• #### This Ionization Rate is Just Right: The Impact of Cosmic-Ray Attenuation on the Carbon Cycle Emission in Molecular Clouds

Context: Observations of carbon cycle species, C, C$^+$, and CO, are commonly used to diagnose gas properties in the interstellar medium but are significantly sensitive to the cosmic-ray ionization rate. However, chemical models commonly assume a constant cosmic-ray ionization rate in the clouds. Aims: We investigate the effect of cosmic-ray attenuation on the emission of carbon cycle species from molecular clouds. Methods: We use a post-processed chemical model of a simulated dense molecular cloud and quantify the variation in both column densities and velocity integrated line emission of the carbon cycle with different cosmic-ray ionization rate models. Results: We find that the deviations in the column density for each of the species can be significant and complex. We show that using a constant ionization rate derived from a mass-weighted average of a physically motivated model, $\zeta_c = 2\times10^{-16}$ s$^{-1}$, can reproduce well the emission of [C{\sc i}] $^3P_1\rightarrow{^3}P_0$ at 609$\mu$m, $^{12}$CO $(J=1-0)$ emission and the [C{\sc ii}] 158$\mu$m emission of the attenuated cosmic-ray ionization rate model. Conclusions: We conclude by recommending the use of a depth-dependent cosmic-ray ionization rate in molecular clouds in multi-line observations, or a tailored constant ionization rate derived from depth-dependent parameterizations.
Cosmic ray, Ionization, Molecular cloud, CNO cycle, HI column density, Line emission, Diffuse gas, Deutsche Forschungsgemeinschaft, Abundance profile, Phase diagram...
• #### Implications of turbulence-dependent diffusion on cosmic-ray spectra

The propagation of cosmic rays can be described as a diffusive motion in most galactic environments. High-energy gamma-rays measured by Fermi have allowed inference of a gradient in the cosmic-ray density and spectral energy behavior in the Milky Way, which is not predicted by models. Here, a turbulence-dependent diffusion model is used to probe different types of cosmic-ray diffusion tensors. Crucially, it is demonstrated that the observed gradients can be explained through turbulence-dependent energy-scaling of the diffusion tensor.
Turbulence, Cosmic ray, Diffusion coefficient, Milky Way, Inner galaxy, Momentum diffusion, Galactic plane, Cosmic ray diffusion, Inference, High energy gamma-ray...
• #### Outflows in the presence of cosmic rays and waves with cooling

Plasma outflow from a gravitational potential well with cosmic rays and self-excited Alfv\'en waves, including the effects of cooling and wave damping, is studied in the hydrodynamic regime. We seek physically allowable steady-state subsonic-supersonic transonic solutions. We adopted a multi-fluid hydrodynamical model for the cosmic ray plasma system. Thermal plasma, cosmic rays, and self-excited Alfv\'en waves are treated as fluids. Interactions such as cosmic-ray streaming instability, cooling, and wave damping were fully taken into account. We considered one-dimensional geometry and explored steady-state solutions. The model is reduced to a set of ordinary differential equations, which we solved for subsonic-supersonic transonic solutions with given boundary conditions at the base of the gravitational potential well. We find that physically allowable subsonic-supersonic transonic solutions exist for a wide range of parameters. We studied the three-fluid system (considering only forward-propagating Alfv\'en waves) in detail. We examined the cases with and without cosmic ray diffusion separately. Comparisons of solutions with and without cooling and with and without wave damping for the same set of boundary conditions (on density, pressures of thermal gas, cosmic rays and waves) are presented. We also present the interesting case of a four-fluid system (both forward- and backward-propagating Alfv\'en waves are included), highlighting the intriguing relation between different components.
Cosmic ray, Cooling, Transonic, Cosmic ray diffusion, Speed of sound, Diffusion coefficient, Steady state, Critical line, Critical point, Two-stream instability...
• #### Galactic Streams of Cosmic-ray Electrons and Positrons ver. 2

Isotropic diffusion is a key assumption in many models of cosmic-ray electrons and positrons. We find that simulation results imply a critical energy of ~10-1000~GeV above which electrons and positrons can spend their entire lives in streams threading magnetic fields, due to energy losses. This would restrict the number of electron/positron sources contributing at Earth, likely leading to smooth electron/positron spectra, as is observed. For positrons, this could be as few as one, with an enhanced flux that would ease energetics concerns of a pulsar origin of the positron excess, or even zero, bringing dark matter to the fore. We conclude that ideas about electron/positron propagation must be revised and discuss implications for recent AMS-02 data.
Positron, Cosmic ray electron, Earth, Pulsar, Inverse Compton, Anisotropy, Regularization, Cosmic ray, Random Field, Interstellar medium...
• #### Fluid energy cascade rate and kinetic damping: new insight from 3D Landau-fluid simulations ver. 2

Using an exact law for incompressible Hall magnetohydrodynamics (HMHD) turbulence, the energy cascade rate is computed from three-dimensional HMHD-CGL (bi-adiabatic ions and isothermal electrons) and Landau fluid (LF) numerical simulations that feature different intensities of Landau damping over a broad range of wavenumbers, typically $0.05\lesssim k_\perp d_i \lesssim100$. Using three sets of cross-scale simulations where turbulence is initiated at large, medium and small scales, the ability of the fluid energy cascade to "sense" the kinetic Landau damping at different scales is tested. The cascade rate estimated from the exact law and the dissipation calculated directly from the simulation are shown to reflect the role of Landau damping in dissipating energy at all scales, with an emphasis on the kinetic ones. This result provides new prospects on using exact laws for simplified fluid models to analyze dissipation in kinetic simulations and spacecraft observations, and new insights into theoretical description of collisionless magnetized plasmas.
Landau damping, Dissipation, Turbulence, Numerical simulation, Magnetized plasma, Intensity, Entropy, Magnetosonic wave, Astrophysical plasma, Magnetohydrodynamics...
• #### Cosmological Vlasov-Poisson equations for dark matter: Recent developments and connections to selected plasma problems

The cosmic large-scale structures of the Universe are mainly the result of the gravitational instability of initially small density fluctuations in the dark-matter distribution. Dark matter appears to be initially cold and behaves as a continuous and collisionless medium on cosmological scales, with evolution governed by the gravitational Vlasov--Poisson equations. Cold dark matter can accumulate very efficiently at focused locations, leading to a highly non-linear filamentary network with extreme matter densities. Traditionally, investigating the non-linear Vlasov--Poisson equations was reserved for massively parallelised numerical simulations. Recently, theoretical progress has allowed us to analyse the mathematical structure of the first infinite densities in the dark-matter distribution by elementary means. We review related advances and provide intriguing connections to classical plasma problems, such as the beam-plasma instability.
Shell crossing, Dark matter, Vlasov-Poisson equation, Phase space, Numerical simulation, Dark Matter Density Profile, Standard perturbation theory, Cold dark matter, Zeldovich approximation, Cosmology...
• #### An Introduction to Variational Autoencoders ver. 3

Variational autoencoders provide a principled framework for learning deep latent-variable models and corresponding inference models. In this work, we provide an introduction to variational autoencoders and some important extensions.
Inference, Generative model, Latent variable, Autoencoder, Statistical estimator, Neural network, Optimization, Latent space, Classification, Marginal likelihood...
• #### Attention Is All You Need ver. 5

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
Attention, Transformer, Transduction, Architecture, Recurrent neural network, Machine translation, Convolution Neural Network, Training set, Embedding, Path length...
• #### Core-collapse, evaporation and tidal effects: the life story of a self-interacting dark matter subhalo ver. 2

Recently, work on self-interacting dark matter (SIDM) cosmologies has shown that an enormous diversity of dark matter (DM) halo density profiles is possible for a fixed SIDM model, ranging from the development of low-density cores to high-density core-collapsed cusps. The possibility of the growth of high central density in low-mass halos, accelerated if halos are subhalos of larger systems, has intriguing consequences for small-halo searches with substructure lensing. However, following the evolution of $\lesssim 10^8 M_\odot$ subhalos in lens-mass systems ($\sim 10^{13}M_\odot$) is computationally expensive with traditional N-body simulations. In this work, we develop a new hybrid semi-analytical + N-body method to study the evolution of SIDM subhalos with high fidelity, from core formation to core-collapse, in staged simulations. With this method, we are able to capture the evaporation of subhalo particles by interactions with host halo particles, an effect that has not yet been fully explored in the context of subhalo core-collapse. We find three main processes driving subhalo evolution: subhalo internal heat outflow, host-subhalo evaporation, and tidal effects. We conclude that the subhalo central density grows only when the heat outflow outweighs the energy gain from evaporation and tidal heating. Thus, evaporation delays or even disrupts subhalo core-collapse. We map out the parameter space for subhalos to core-collapse, and find that it is nearly impossible to drive core collapse in subhalos in SIDM models with constant cross sections. Any discovery of ultra-compact dark substructures with future substructure lensing observations disfavors SIDM models with constant cross sections, indicating instead the presence of additional degrees of freedom, such as velocity-dependence or dissipation of energy.
Dark matter subhalo, Core collapse, Self-interacting dark matter, Evaporation, Dark matter, Cooling, Tidal effects, Periapsis, Dynamical friction, Cold dark matter...
• #### Updated sensitivity of DUNE in 3+1 scenario with far and near detectors

In this paper we present the updated physics sensitivity of DUNE in the presence of a light sterile neutrino with both far and near detectors. In previous studies, the sensitivities were obtained using the configuration of DUNE as described in the conceptual design report (CDR). In this article, we consider the configuration of DUNE as given in the technical design report (TDR) and study the capability of this experiment to constrain the sterile mixing parameters as well as its capability to measure the standard oscillation parameters in the 3+1 scenario. Our results show that in the 3+1 scenario, the sensitivity of DUNE to the mass hierarchy, octant and CP violation deteriorates if we only consider the far detector. However, a combined analysis of the far and near detectors improves the sensitivity.
DUNE experiment, Sterile neutrino, Technical Design Report, CP violation, Conceptual Design Report, Mixing angle, Neutrino oscillation experiments, Neutrino mass hierarchy, CP violating phase, Flavour...
• #### Stability and instability results of the Kirchhoff plate equation with delay terms on boundary or dynamical boundary controls

In this paper, we consider two models of the Kirchhoff plate equation, the first one with delay terms on the dynamical boundary controls (see system (1.1) below), and the second one where delay terms on the boundary control are added (see system (1.2) below). For the first system, we prove its well-posedness, strong stability, non-exponential stability, and polynomial stability under a multiplier geometric control condition. For the second one, we prove its well-posedness, strong stability, and exponential stability under the same multiplier geometric control condition. Finally, we give some instability examples of system (1.2) for some choices of delays.
Instability, Exponential stability
• #### A fully data-driven algorithm for accurate shear estimation ver. 2

Weak lensing by large-scale structure is a powerful probe of cosmology if the apparent alignments in the shapes of distant galaxies can be accurately measured. We study the performance of a fully data-driven approach, based on MetaDetection, focusing on the more realistic case of observations with an anisotropic PSF. Under the assumption that PSF anisotropy is the only source of additive shear bias, we show how unbiased shear estimates can be obtained from the observed data alone. To do so, we exploit the finding that the multiplicative shear bias obtained with MetaDetection is nearly insensitive to the PSF ellipticity. In practice, this assumption can be validated by comparing the empirical corrections obtained from observations to those from simulated data. We show that our data-driven approach meets the stringent requirements for upcoming space and ground-based surveys, although further optimisation is possible.
Sheared, Galaxy, Anisotropy, Ellipticity, Multiplicative bias, Milky Way, Weight function, Cosmic shear, Weak lensing, Surface brightness...
• #### The Epoch of Reionization 21-cm Bispectrum: The impact of light-cone effects and detectability ver. 2

We study the spherically averaged bispectrum of the 21-cm signal from the Epoch of Reionization (EoR). This metric provides a quantitative measurement of the level of non-Gaussianity of the signal, which is expected to be high. We focus on the impact of the light-cone effect on the bispectrum and its detectability with the future SKA-Low telescope. Our investigation is based on a single reionization light-cone model and an ensemble of 50 realisations of the 21-cm signal to estimate the cosmic variance errors. We calculate the bispectrum with a new, optimised direct estimation method, DviSukta, which calculates the bispectrum for all possible unique triangles. We find that the light-cone effect becomes important on scales $k_1 \lesssim 0.1\,{\rm Mpc}^{-1}$ where for most triangle shapes the cosmic variance errors dominate. Only for the squeezed-limit triangles does the impact of the light-cone effect exceed the cosmic variance. Combining the effects of system noise and cosmic variance we find that $\sim 3\sigma$ detection of the bispectrum is possible for all unique triangle shapes around a scale of $k_1 \sim 0.2\,{\rm Mpc}^{-1}$, and cosmic variance errors dominate above and noise errors below this length scale. Only the squeezed-limit triangles are able to achieve a more than $5\sigma$ significance over a wide range of scales, $k_1 \lesssim 0.8\,{\rm Mpc}^{-1}$. Our results suggest that among all the possible triangle combinations for the bispectrum, the squeezed-limit one will be the most measurable and hence useful.
Cosmic variance, Bispectrum, Hydrogen 21 cm line, Epoch of reionization, Light cones, Square Kilometre Array, Signal to noise ratio, Reionization, Non-Gaussianity, Line of sight...
• #### Asymptotically rigid mapping class groups II: strand diagrams and nonpositive curvature

In this article, we introduce a new family of groups, called Chambord groups and constructed from braided strand diagrams associated to specific semigroup presentations. It includes the asymptotically rigid mapping class groups previously studied by the authors such as the braided Higman-Thompson groups and the braided Houghton groups. Our main result shows that polycyclic subgroups in Chambord groups are virtually abelian and undistorted.
Subgroup, Exact sequence, Braid group, Morphism, Isomorphism, Groupoid, Isotopy, Torsion subgroup, Isometry, Permutation...

Many force-gradient explicit symplectic integration algorithms have been designed for the Hamiltonian $H=T (\mathbf{p})+V(\mathbf{q})$ with a kinetic energy $T(\mathbf{p})=\mathbf{p}^2/2$ in the existing literature. When the force-gradient operator is appropriately adjusted as a new operator, they are still suitable for a class of Hamiltonian problems $H=K(\mathbf{p},\mathbf{q})+V(\mathbf{q})$ with \emph{integrable} part $K(\mathbf{p},\mathbf{q}) = \sum_{i=1}^{n} \sum_{j=1}^{n}a_{ij}p_ip_j+\sum_{i=1}^{n} b_ip_i$, where $a_{ij}=a_{ij}(\textbf{q})$ and $b_i=b_i(\textbf{q})$ are functions of coordinates $\textbf{q}$. The newly adjusted operator is not a force-gradient operator but is similar to the momentum-version operator associated with the potential $V$. The newly extended (or adjusted) algorithms are no longer solvers of the original Hamiltonian, but are solvers of slightly modified Hamiltonians. They are explicit symplectic integrators with time reversibility and time symmetry. Numerical tests show that the standard symplectic integrators without the new operator generally have poorer computational accuracy and efficiency than the corresponding extended methods with the new operator. The optimized methods are more accurate than the corresponding non-optimized methods. Among the tested symplectic methods, the two extended optimized seven-stage fourth-order methods of Omelyan, Mryglod and Folk exhibit the best numerical performance. As a result, one of the two optimized algorithms is used to study the orbital dynamical features of a modified H\'{e}non-Heiles system and a spring pendulum. These extended integrators allow for integrations in Hamiltonian problems, such as the spiral structure in self-consistent models of rotating galaxies and the spiral arms in galaxies.
Hamiltonian, Messier 4, Symplectic integrator, Chaos, Torus, Galaxy, Phase space, Time-reversal symmetry, Spiral arm, Spiral structure...
• #### The discovery of rest-frame UV colour gradients and a diversity of dust morphologies in bright z ~ 7 Lyman-break galaxies

We present deep ALMA dust continuum observations for a sample of luminous ($M_{\rm UV} < -22$) star-forming galaxies at $z \simeq 7$. We detect five of the six sources in the far-infrared (FIR), providing key constraints on the obscured star-formation rate (SFR) and the infrared-excess-$\beta$ (IRX-$\beta$) relation without the need for stacking. Despite the galaxies showing blue rest-frame UV slopes ($\beta \simeq -2$) we find that 35-75 percent of the total SFR is obscured. We find the IRX-$\beta$ relation derived for these $z \simeq 7$ sources is consistent with that found for local star-burst galaxies. Using our relatively high-resolution (FWHM $\simeq 0.7\,{\rm arcsec}$) observations we identify a diversity of dust morphologies in the sample. We find both compact emission that appears offset relative to the unobscured components and extended dust emission that is co-spatial with the rest-frame UV light. In the majority of the sources we detect strong rest-frame UV colour gradients (with up to $\Delta \beta \simeq 0.7$-$1.4$) as probed by the multi-band UltraVISTA ground-based data. The observed redder colours are spatially correlated with the location of the FIR detection. Our results show that, even in bright Lyman-break galaxies at $z \simeq 7$, the majority of the star-formation is actually occurring within the faintest components in the rest-frame UV, which have an obscured fraction of $f_{\rm obs} \ge 0.8$. As well as demonstrating the importance of dust obscured star-formation within the Epoch of Reionization, these observations provide an exciting taster of the rich spatially resolved datasets that will be obtained from JWST and high-resolution ALMA follow-up at these redshifts.
Atacama Large Millimeter Array, Galaxy, Lyman break galaxy, Star formation rate, Star formation, Milky Way, Wide Field Camera 3, Epoch of reionization, Continuum emission, Full width at half maximum...
• #### LYRA II: Cosmological dwarf galaxy formation with inhomogeneous Population III enrichment

We present the simulation of a $2\times10^9 M_\odot$ halo mass cosmological dwarf galaxy run to $z=0$ at 4 solar mass gas resolution with resolved supernova feedback. We compare three simple subgrid implementations for the inhomogeneous chemical enrichment from Population III stars and compare them to constraints from Local Group dwarf galaxies. The employed model, LYRA, is a novel high resolution galaxy formation model built for the moving mesh code AREPO, which is marked by a resolved multi-phase interstellar medium, single stars and individual supernova events. The resulting reionization relic is characterized by a short ($<1.5$ Gyr) star formation history that is repeatedly brought to a standstill by violent bursts of feedback. Star formation is reignited for a short duration due to a merger at $z\approx4$ and then again at $z\approx0.2-0$ after sustained gas accretion. Our model $z=0$ galaxy matches the stellar mass, size, stellar kinematics and metallicity relations of Local Group dwarf galaxies well. The dark matter profile does not exhibit a core in any version of the model. We show that the host halo masses of Population III stars affect the assembly history of dwarf galaxies. This manifests itself through the initial gaseous collapse in the progenitor halos, affecting the central density of the stellar component and through the accretion of luminous substructure.
Dwarf galaxy, Star formation, Population III, Virial mass, Galaxy Formation, Interstellar medium, Star, Reionization, Accretion, Supernova...
• #### Galaxy cluster strong lensing cosmography: cosmological constraints from a sample of regular galaxy clusters

Cluster strong lensing cosmography is a promising probe of the background geometry of the Universe and several studies have emerged, thanks to the increased quality of observations using space and ground-based telescopes. For the first time, we use a sample of five cluster strong lenses to measure the values of cosmological parameters and combine them with those from classical probes. In order to assess the degeneracies and the effectiveness of strong-lensing cosmography in constraining the background geometry of the Universe, we adopt four cosmological scenarios. We find good constraining power on the total matter density of the Universe ($\Omega_{\rm m}$) and the equation of state of the dark energy parameter $w$. For a flat $w$CDM cosmology, we find $\Omega_{\rm m} = 0.30_{-0.11}^{+0.09}$ and $w=-1.12_{-0.32}^{+0.17}$ from strong lensing only. Interestingly, we show that the constraints from the Cosmic Microwave Background (CMB) are improved by factors of 2.5 and 4.0 on $\Omega_{\rm m}$ and $w$, respectively, when combined with our posterior distributions in this cosmological model. In a scenario where the equation of state of dark energy evolves with redshift, the strong lensing constraints are compatible with a cosmological constant (i.e. $w=-1$). In a curved cosmology, our strong lensing analyses can accommodate a large range of values for the curvature of the Universe of $\Omega_{\rm k}=0.28_{-0.21}^{+0.16}$. In all cosmological scenarios, we show that our strong lensing constraints are complementary and in good agreement with measurements from the CMB, baryon acoustic oscillations and Type Ia supernovae. Our results show that cluster strong lensing cosmography is a potentially powerful probe to be included in the cosmological analyses of future surveys.
Strong gravitational lensing, Cosmology, Cosmic microwave background, Cosmological parameters, Cosmography, Cluster of galaxies, MAssive Cluster Survey, Mass distribution, Cosmological constraints, Baryon acoustic oscillations...
• #### lambeq: An Efficient High-Level Python Library for Quantum NLP

We present lambeq, the first high-level Python library for Quantum Natural Language Processing (QNLP). The open-source toolkit offers a detailed hierarchy of modules and classes implementing all stages of a pipeline for converting sentences to string diagrams, tensor networks, and quantum circuits ready to be used on a quantum computer. lambeq supports syntactic parsing, rewriting and simplification of string diagrams, ansatz creation and manipulation, as well as a number of compositional models for preparing quantum-friendly representations of sentences, employing various degrees of syntax sensitivity. We present the generic architecture and describe the most important modules in detail, demonstrating the usage with illustrative examples. Further, we test the toolkit in practice by using it to perform a number of experiments on simple NLP tasks, implementing both classical and quantum pipelines.
Computational linguistics, Quantum circuit, Python, Combinatory categorial grammar, Open source, Pregroup, Architecture, Qubit, Pregroup Grammar, Quantum machine learning...
• #### Inside an Asymptotically Flat Hairy Black Hole

We study the interior of a recently constructed family of asymptotically flat, charged black holes that develop (charged) scalar hair as one increases their charge at fixed mass. Inside the horizon, these black holes resemble the interior of a holographic superconductor. There are analogs of the Josephson oscillations of the scalar field, and the final Kasner singularity depends very sensitively on the black hole parameters near the onset of the instability. In an Appendix, we give a general argument that Cauchy horizons cannot exist in a large class of stationary black holes with scalar hair.
Black hole, Horizon, Scalar field, Cauchy horizon, Event horizon, Instability, Charged black hole, Curvature, Superconductor, Proper time...
• #### Writing Scientific Papers in Astronomy

Writing is a vital component of a modern career in astronomical research. Very few researchers, however, receive any training in how to produce high-quality written work in an efficient manner. We present a step-by-step guide to writing in astronomy. We concentrate on how to write scientific papers, and address various aspects including how to crystallise the ideas that underlie the research project, and how the paper is constructed considering the audience and the chosen journal. We also describe a number of grammar and spelling issues that often cause trouble to writers, including some that are particularly hard to master for non-native English speakers. This paper is aimed primarily at Master's and PhD level students who are presented with the daunting task of writing their first scientific paper, but more senior researchers or writing instructors may well find the ideas presented here useful.
Grammar, Astronomical research, Astronomy