Recently bookmarked papers

with concepts:
  • This is a tutorial and survey paper on factor analysis, probabilistic Principal Component Analysis (PCA), variational inference, and Variational Autoencoder (VAE). These methods, which are tightly related, are dimensionality reduction and generative models. They assume that every data point is generated from or caused by a low-dimensional latent factor. By learning the parameters of the latent-space distribution, the corresponding low-dimensional factors are found for the sake of dimensionality reduction. Because of their stochastic and generative behaviour, these models can also be used to generate new data points in the data space. In this paper, we start with variational inference, where we derive the Evidence Lower Bound (ELBO) and Expectation Maximization (EM) for learning the parameters. Then, we introduce factor analysis, derive its joint and marginal distributions, and work out its EM steps. Probabilistic PCA is then explained as a special case of factor analysis, and its closed-form solutions are derived. Finally, VAE is explained, where the encoder, decoder and sampling from the latent space are introduced. Training a VAE using both EM and backpropagation is explained.
    Expectation maximization, Latent space, Inference, Principal component analysis, Latent variable, Gaussian distribution, Covariance matrix, Variational autoencoders, Maximum likelihood estimation, Generative model, ...
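A quick reference for the central quantity in the entry above: the Evidence Lower Bound (ELBO) in its standard textbook form (generic notation, not necessarily the paper's), which both EM and VAE training maximize by alternating over the variational distribution $q$ and the model parameters:

$$\log p(x) \;=\; \mathrm{ELBO}(q) + \mathrm{KL}\big(q(z\mid x)\,\|\,p(z\mid x)\big) \;\ge\; \mathrm{ELBO}(q) \;=\; \mathbb{E}_{q(z\mid x)}\big[\log p(x\mid z)\big] \;-\; \mathrm{KL}\big(q(z\mid x)\,\|\,p(z)\big).$$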
  • This tutorial paper presents a didactic treatment of the emerging topic of signal processing on higher-order networks. Drawing analogies from discrete and graph signal processing, we introduce the building blocks for processing data on simplicial complexes and hypergraphs, two common abstractions of higher-order networks that can incorporate polyadic relationships. We provide basic introductions to simplicial complexes and hypergraphs, placing special emphasis on the concepts needed for processing signals on them. Leveraging these concepts, we discuss Fourier analysis, signal denoising, signal interpolation, node embeddings, and non-linear processing through neural networks in these two representations of polyadic relational structures. In the context of simplicial complexes, we specifically focus on signal processing using the Hodge Laplacian matrix, a multi-relational operator that leverages the special structure of simplicial complexes and generalizes desirable properties of the Laplacian matrix in graph signal processing. For hypergraphs, we present both matrix and tensor representations, and discuss the trade-offs in adopting one or the other. We also highlight limitations and potential research avenues, both to inform practitioners and to motivate the contribution of new researchers to the area.
    Graph, Hypergraph, Signal processing, Architecture, Orientation, Graph Neural Network, Line graph, Embedding, Neural network, Laplacian matrix, ...
  • Networks provide a meaningful way to represent and analyze complex biological information, but the methodological details of network-based tools are often described for a technical audience. Graphery is a hands-on tutorial webserver designed to help biological researchers understand the fundamental concepts behind commonly-used graph algorithms. Each tutorial describes a graph concept along with executable Python code that visualizes the concept in a code view and a graph view. Graphery tutorials help researchers understand graph statistics (such as degree distribution and network modularity) and classic graph algorithms (such as shortest paths and random walks). Users navigate each tutorial using their choice of real-world biological networks, ranging in scale from molecular interaction graphs to ecological networks. Graphery also allows users to modify the code within each tutorial or write new programs, which all can be executed without requiring an account. Discipline-focused tutorials will be essential to help researchers interpret their biological data. Graphery accepts ideas for new tutorials and datasets that will be shaped by both computational and biological researchers, growing into a community-contributed learning platform. Availability: Graphery is available at https://graphery.reedcompbio.org/.
    Graph, Python, Programming, Random walk, Statistics, Degree distribution, Ecological network, Modularity, Application programming interface, Gene, ...
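The graph statistics and classic algorithms that the Graphery tutorials cover can be reproduced in a few lines of ordinary Python; the sketch below uses networkx and a built-in toy graph rather than Graphery's own code or its biological networks.

```python
# Illustrative only: degree distribution, shortest path, and a random walk
# on a small built-in graph (stand-in for a biological network).
import random
from collections import Counter

import networkx as nx

G = nx.karate_club_graph()

# Degree distribution: how many nodes have each degree.
degree_counts = Counter(d for _, d in G.degree())
print("degree distribution:", sorted(degree_counts.items()))

# Shortest path between two nodes (unweighted breadth-first search).
print("shortest path 0 -> 33:", nx.shortest_path(G, source=0, target=33))

# A simple random walk of fixed length starting from node 0.
node, walk = 0, [0]
for _ in range(10):
    node = random.choice(list(G.neighbors(node)))
    walk.append(node)
print("random walk:", walk)
```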
  • We consider two dimensional CFT states that are produced by a gravitational path integral. As a first case, we consider a state produced by Euclidean $AdS_2$ evolution followed by flat space evolution. We use the fine grained entropy formula to explore the nature of the state. We find that the naive hyperbolic space geometry leads to a paradox. This is solved if we include a geometry that connects the bra with the ket, a bra-ket wormhole. The semiclassical Lorentzian interpretation leads to a CFT state entangled with an expanding and collapsing Friedmann cosmology. As a second case, we consider a state produced by Lorentzian $dS_2$ evolution, again followed by flat space evolution. The most naive geometry also leads to a similar paradox. We explore several possible bra-ket wormholes. The most obvious one leads to a badly divergent temperature. The most promising one also leads to a divergent temperature but by making a projection onto low energy states we find that it has features that look similar to the previous Euclidean case. In particular, the maximum entropy of an interval in the future is set by the de Sitter entropy.
    Entropy, Wormhole, Conformal field theory, De Sitter space, Dilaton, Ds meson, Partition function, Black hole, Path integral, Density matrix, ...
  • The eigenstate thermalization hypothesis (ETH) explains how closed unitary quantum systems can exhibit thermal behavior in pure states. In this work we examine a recently proposed microscopic model of a black hole in AdS$_2$, the so-called Sachdev-Ye-Kitaev (SYK) model. We show that this model satisfies the eigenstate thermalization hypothesis by solving the system in exact diagonalization. Using these results we also study the behavior, in eigenstates, of various measures of thermalization and scrambling of information. We establish that two-point functions in finite-energy eigenstates approximate closely their thermal counterparts and that information is scrambled in individual eigenstates. We study both the eigenstates of a single random realization of the model, as well as the model obtained after averaging of the random disordered couplings. We use our results to comment on the implications for thermal states of the dual theory, i.e. the AdS$_2$ black hole.
    Sachdev-Ye-Kitaev model, Thermalisation, Eigenstate Thermalization Hypothesis, Two-point correlation function, Random matrix theory, Hamiltonian, Disorder, Expectation Value, Black hole, Form factor, ...
  • Correlators in conformal field theory are naturally organized as a sum over conformal blocks. In holographic theories, this sum must reorganize into a path integral over bulk fields and geometries. We explore how these two sums are related in the case of a point particle moving in the background of a 3d collapsing black hole. The conformal block expansion is recast as a sum over paths of the first-quantized particle moving in the bulk geometry. Off-shell worldlines of the particle correspond to subdominant contributions in the Euclidean conformal block expansion, but these same operators must be included in order to correctly reproduce complex saddles in the Lorentzian theory. During thermalization, a complex saddle dominates under certain circumstances; in this case, the CFT correlator is not given by the Virasoro identity block in any channel, but can be recovered by summing heavy operators. This effectively converts the conformal block expansion in CFT from a sum over intermediate states to a sum over channels that mimics the bulk path integral.
    Conformal field theory, Path integral, Two-point correlation function, Complex plane, Black hole, Propagator, Monodromy, Saddle point, Anti de Sitter space, Thermalisation, ...
  • We present a first-principles CFT calculation corresponding to the spherical collapse of a shell of matter in three dimensional quantum gravity. In field theory terms, we describe the equilibration process, from early times to thermalization, of a CFT following a sudden injection of energy at time t=0. By formulating a continuum version of Zamolodchikov's monodromy method to calculate conformal blocks at large central charge c, we give a framework to compute a general class of probe observables in the collapse state, incorporating the full backreaction of matter fields on the dual geometry. This is illustrated by calculating a scalar field two-point function at time-like separation and the time-dependent entanglement entropy of an interval, both showing thermalization at late times. The results are in perfect agreement with previous gravity calculations in the AdS$_3$-Vaidya geometry. Information loss appears in the CFT as an explicit violation of unitarity in the 1/c expansion, restored by nonperturbative corrections.
    Conformal field theory, Black hole, Monodromy, Entanglement entropy, Entanglement, Unitarity, Central charge, Two-point correlation function, Thermalisation, Field theory, ...
  • The presence of an invisible substructure has previously been detected in the gravitational lens galaxy SDSSJ0946+1006 through its perturbation of the lensed images. Using flexible models for the main halo and the subhalo perturbation to fit the lensed images, we demonstrate that the subhalo has an extraordinarily high central density and steep density slope. The inferred concentration for the subhalo is well above the expected scatter in concentrations for $\Lambda$CDM halos of similar mass. We robustly infer the subhalo's projected mass within 1 kpc to be $\sim 2$-$3.7\times 10^9$M$_\odot$ at $>$95% CL for all our lens models, while the average slope of the subhalo's projected density profile over the radial range 0.75-1.25 kpc is constrained to be steeper than isothermal ($\gamma_{2D} \lesssim -1$). By modeling the subhalo light directly, we infer a conservative upper bound on its luminosity $L_V < 1.2\times 10^8L_\odot$ at 95% CL, which shows that the perturber is dark matter dominated. To compare to $\Lambda$CDM expectations, we analyze subhalos within analogues of lensing galaxies in the Illustris TNG100-1 simulation over many lines of sight, and find hundreds of subhalos that achieve a projected mass within 1 kpc of $\gtrsim 2\times10^9M_\odot$. However, less than 1% of the mock observations yield a log-slope steep enough to be consistent with our lensing models, and they $all$ have stellar masses in excess of that allowed by observations by about an order of magnitude or more. We conclude that the presence of such a dark, highly concentrated subhalo is unexpected in a $\Lambda$CDM universe. Finally, we show that this tension with CDM is not significantly reduced if the perturber is assumed to be a line-of-sight structure, rather than a subhalo.
    Dark matter subhalo, Line of sight, Cold dark matter, Stellar mass, Gravitational lens galaxy, Luminosity, Galaxy, Dark matter, Surface brightness, IllustrisTNG simulation, ...
  • We propose a paradigm for realizing the SYK model within string theory. Using the large $N$ matrix description of $c<1$ string theory, we show that the effective theory on a large number $Q$ of FZZT D-branes in $(p,1)$ minimal string theory takes the form of the disorder averaged SYK model with $J \psi^{p}$ interaction. The SYK fermions represent open strings between the FZZT branes and the ZZ branes that underlie the matrix model. The continuum SYK dynamics arises upon taking the large $Q$ limit. We observe several qualitative and quantitative links between the SYK model and $(p,q)$ minimal string theory and propose that the two describe different phases of a single system. We comment on the dual string interpretation of double scaled SYK and on the relevance of our results to the recent discussion of the role of ensemble averaging in holography.
    Sachdev-Ye-Kitaev model, String theory, Random matrix theory, Open string theory, Partition function, Worldsheet, D-brane, Conformal field theory, Minimal models, Disorder, ...
  • We propose a bootstrap program for CFTs near intersecting boundaries which form a co-dimension 2 edge. We describe the kinematical setup and show that bulk 1-pt functions and bulk-edge 2-pt functions depend on a non-trivial cross-ratio and on the angle between the boundaries. Using the boundary OPE (BOE) with respect to each boundary, we derive two independent conformal block expansions for these correlators. The matching of the two BOE expansions leads to a crossing equation. We analytically solve this equation in several simple cases, notably for a free bulk field, where we recover Feynman-diagrammatic results by Cardy.
    Conformal field theory, Boundary conformal field theory, Programming, Conformal Bootstrap, Kinematics, Dirichlet boundary condition, Embedding, Conformal symmetry, Two-point correlation function, OPE coefficients, ...
  • Urban areas are negatively impacted by Carbon Dioxide (CO2) and Nitrogen Oxide (NOx) emissions. In order to achieve a cost-effective reduction of greenhouse gas emissions and to combat climate change, the European Union (EU) introduced an Emissions Trading System (ETS) where organizations can buy or receive emission allowances as needed. The current ETS is a centralized one, consisting of a set of complex rules. It is currently administered at the organizational level and is used for fixed-point sources of pollution such as factories, power plants, and refineries. However, the current ETS cannot efficiently cope with vehicle mobility, even though vehicles are one of the primary sources of CO2 and NOx emissions. In this study, we propose a new distributed Blockchain-based emissions allowance trading system called B-ETS. This system enables transparent and trustworthy data exchange as well as trading of allowances among vehicles, relying on vehicle-to-vehicle communication. In addition, we introduce an economic incentive-based mechanism that appeals to individual drivers and leads them to modify their driving behavior in order to reduce emissions. The efficiency of the proposed system is studied through extensive simulations, showing how increased vehicle connectivity can lead to a reduction of the emissions generated from those vehicles. We demonstrate that our method can be used for full life-cycle monitoring and fuel economy reporting. This leads us to conjecture that the proposed system could lead to important behavioral changes among the drivers.
    Carbon dioxide, Climate, Mobility, Market, Oxide, Point source, Programming, Security, Open source, Interference, ...
  • This paper presents "Stim", a fast simulator for quantum stabilizer circuits. The paper explains how Stim works and compares it to existing tools. With no foreknowledge, Stim can analyze a distance 100 surface code circuit (20 thousand qubits, 8 million gates, 1 million measurements) in 15 seconds and then begin sampling full circuit shots at a rate of 1 kHz. Stim uses a stabilizer tableau representation, similar to Aaronson and Gottesman's CHP simulator, but with three main improvements. First, Stim improves the asymptotic complexity of deterministic measurement from quadratic to linear by tracking the {\em inverse} of the circuit's stabilizer tableau. Second, Stim improves the constant factors of the algorithm by using a cache-friendly data layout and 256 bit wide SIMD instructions. Third, Stim only uses expensive stabilizer tableau simulation for its first sample. Further samples are collected by using the first sample as a baseline for vectorized batches of Pauli frames propagating through the circuit.
    Qubit, Young tableau, Sparsity, Caching, Software, Optimization, Entropy, Programming, Multidimensional Array, Pauli group, ...
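As a rough illustration of the workflow described above (one stabilizer-tableau pass when the sampler is compiled, then cheap batched sampling), a toy circuit through Stim's Python API might look like the sketch below; the call names come from the public `stim` package but should be treated as assumptions to check against its documentation, and the distance-100 surface-code numbers quoted in the abstract are far beyond this example.

```python
# Hedged sketch: sample a tiny stabilizer circuit with Stim.
import stim

circuit = stim.Circuit("""
    H 0
    CNOT 0 1
    M 0 1
""")

# Compiling the sampler is where the expensive tableau simulation happens;
# subsequent shots come from vectorized Pauli-frame propagation.
sampler = circuit.compile_sampler()
shots = sampler.sample(shots=8)  # one row of measurement outcomes per shot
print(shots)
```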
  • This Letter capitalizes on a unique set of total solar eclipse observations, acquired between 2006 and 2020, in white light, Fe XI 789.2 nm ($\rm T_{fexi}$ = $1.2 \pm 0.1$ MK) and Fe XIV 530.3 nm ($\rm T_{fexiv}$ = $ 1.8 \pm 0.1$ MK) emission, complemented by in situ Fe charge state and proton speed measurements from ACE/SWEPAM-SWICS, to identify the source regions of different solar wind streams. The eclipse observations reveal the ubiquity of open structures, invariably associated with Fe XI emission from $\rm Fe^{10+}$, hence a constant electron temperature, $\rm T_{c}$ = $\rm T_{fexi}$, in the expanding corona. The in situ Fe charge states are found to cluster around $\rm Fe^{10+}$, independently of the 300 to 700 km $\rm s^{-1}$ stream speeds, referred to as the continual solar wind. $\rm Fe^{10+}$ thus yields the fiducial link between the continual solar wind and its $\rm T_{fexi}$ sources at the Sun. While the spatial distribution of Fe XIV emission, from $\rm Fe^{13+}$, associated with streamers, changes throughout the solar cycle, the sporadic appearance of charge states $> \rm Fe^{11+}$, in situ, exhibits no cycle dependence regardless of speed. These latter streams are conjectured to be released from hot coronal plasmas at temperatures $\ge \rm T_{fexiv}$ within the bulge of streamers and from active regions, driven by the dynamic behavior of prominences magnetically linked to them. The discovery of continual streams of slow, intermediate and fast solar wind, characterized by the same $\rm T_{fexi}$ in the expanding corona, places new constraints on the physical processes shaping the solar wind.
    Solar wind, Corona, Eclipses, Solar cycle, Solar eclipses, Advanced Composition Explorer, Sun, Electron temperature, Solar activity, Ionization, ...
  • This article explores the number of time series points required from a high-speed computer network to accurately estimate the Hurst exponent. The methodology consists in designing an experiment in which estimators are applied to time series obtained from captures of high-speed network traffic, and then determining the minimum number of points required to obtain accurate estimates of the Hurst exponent. The methodology provides an exhaustive analysis of the Hurst exponent considering bias behaviour, standard deviation, and Mean Squared Error, using fractional Gaussian noise signals with stationary increments. Our results show that the Whittle estimator successfully estimates the Hurst exponent in series with few points. Based on the results obtained, a minimum length for the time series is empirically proposed. Finally, to validate the results, the methodology is applied to real traffic captures in a high-speed computer network.
    Hurst exponent, Time Series, Statistical estimator, Mean squared error, Gaussian noise, Networks, ...
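For a concrete feel for how Hurst-exponent estimates depend on series length, here is a small rescaled-range (R/S) estimator in plain numpy; it is not the Whittle estimator analyzed in the paper, just an illustrative alternative.

```python
# Toy rescaled-range (R/S) Hurst estimator; illustrative, not the paper's method.
import numpy as np

def hurst_rs(x, window_sizes=(16, 32, 64, 128, 256)):
    x = np.asarray(x, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        rs_vals = []
        for start in range(0, len(x) - n + 1, n):
            w = x[start:start + n]
            dev = np.cumsum(w - w.mean())   # cumulative deviation from the window mean
            r = dev.max() - dev.min()       # range of the cumulative deviation
            s = w.std()                     # standard deviation of the window
            if s > 0:
                rs_vals.append(r / s)
        if rs_vals:
            log_n.append(np.log(n))
            log_rs.append(np.log(np.mean(rs_vals)))
    # The slope of log(R/S) versus log(n) estimates the Hurst exponent.
    return np.polyfit(log_n, log_rs, 1)[0]

rng = np.random.default_rng(0)
print(hurst_rs(rng.standard_normal(4096)))  # white noise -> roughly 0.5
```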
  • Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same frequency that they were originally experienced, regardless of their significance. In this paper we develop a framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently. We use prioritized experience replay in Deep Q-Networks (DQN), a reinforcement learning algorithm that achieved human-level performance across many Atari games. DQN with prioritized experience replay achieves a new state-of-the-art, outperforming DQN with uniform replay on 41 out of 49 games.
    Reinforcement learning, Hyperparameter, Q-learning, Architecture, Supervised learning, Binary number, Multidimensional Array, Data structures, Function approximation, Optimization, ...
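A minimal sketch of the proportional prioritization idea summarized above, sampling transition $i$ with probability $p_i^\alpha / \sum_k p_k^\alpha$ and correcting the bias with importance-sampling weights; this is plain numpy, without the DQN agent or the sum-tree data structure the paper uses for efficiency.

```python
# Toy proportional prioritized replay buffer (O(N) sampling, no sum-tree).
import numpy as np

class PrioritizedReplay:
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition):
        # New transitions get the current maximum priority so each is replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) >= self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(p)

    def sample(self, batch_size, beta=0.4):
        scaled = np.asarray(self.priorities) ** self.alpha
        probs = scaled / scaled.sum()
        idx = np.random.choice(len(self.data), size=batch_size, p=probs)
        # Importance-sampling weights correct for the non-uniform sampling.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(err) + eps
```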
  • Given an image of a target person and an image of another person wearing a garment, we automatically generate the target person in the given garment. At the core of our method is a pose-conditioned StyleGAN2 latent space interpolation, which seamlessly combines the areas of interest from each image, i.e., body shape, hair, and skin color are derived from the target person, while the garment with its folds, material properties, and shape comes from the garment image. By automatically optimizing for interpolation coefficients per layer in the latent space, we can perform a seamless, yet true to source, merging of the garment and target person. Our algorithm allows for garments to deform according to the given body shape, while preserving pattern and material details. Experiments demonstrate state-of-the-art photo-realistic results at high resolution ($512\times 512$).
    Optimization, Latent space, Generative Adversarial Net, Region of interest, Training set, Architecture, Ablation, Embedding, Hyperparameter, Attention, ...
  • Let $G$ be an acylindrically hyperbolic group on a $\delta$-hyperbolic space $X$. Assume there exists $M$ such that for any generating set $S$ of $G$, $S^M$ contains a hyperbolic element on $X$. Suppose that $G$ is equationally Noetherian. Then we show that the set of growth rates of $G$ is well-ordered. The conclusion is known for hyperbolic groups, and this is a generalization. Our result applies to all lattices in simple Lie groups of rank-1, and more generally, to a family of relatively hyperbolic groups. A potential application is to mapping class groups, to which the theorem applies if they are equationally Noetherian.
    Hyperbolic group, Subgroup, Isometry, Hyperbolic Space, Graph, Geodesic, Isomorphism, Lattice (order), Rank, Normal subgroup, ...
  • We prove that the topological complexity $\mathrm{TC}(\pi)$ equals $\mathrm{cd}(\pi\times\pi)$ for certain toral relatively hyperbolic groups $\pi$.
    Subgroup, Hyperbolic group, Classifying space, Cohomology, Diagonal subgroup, Conjugacy class, Cohomological dimension, Isomorphism, Empty Lattice Approximation, Cohomology with compact support, ...
  • We present PyQUBO, an open-source, Python library for constructing quadratic unconstrained binary optimizations (QUBOs) from the objective functions and the constraints of optimization problems. PyQUBO enables users to prepare QUBOs or Ising models for various combinatorial optimization problems with ease thanks to the abstraction of expressions and the extensibility of the program. QUBOs and Ising models formulated using PyQUBO are solvable by Ising machines, including quantum annealing machines. We introduce the features of PyQUBO with applications in the number partitioning problem, knapsack problem, graph coloring problem, and integer factorization using a binary multiplier. Moreover, we demonstrate how PyQUBO can be applied to production-scale problems through integration with quantum annealing machines. Through its flexibility and ease of use, PyQUBO has the potential to make quantum annealing a more practical tool among researchers.
    Optimization, Hamiltonian, Ising model, Python, Quantum annealing, Embedding, Programming, Graph, Multidimensional Array, Qubit, ...
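A hedged sketch of how the number-partitioning example mentioned above is typically written with PyQUBO; the calls (`Array.create`, `compile`, `to_qubo`) are from the public package but should be checked against its documentation.

```python
# Number partitioning as an Ising problem: H = (sum_i n_i * s_i)^2, s_i in {-1, +1}.
from pyqubo import Array

numbers = [4, 2, 7, 1]
s = Array.create("s", shape=len(numbers), vartype="SPIN")
H = sum(n * s_i for n, s_i in zip(numbers, s)) ** 2

model = H.compile()
qubo, offset = model.to_qubo()  # QUBO dictionary ready for an annealer or other Ising machine
print(offset)
print(sorted(qubo.items())[:3])
```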
  • Several proposed models for dark matter posit the existence of self-interaction processes that can impact the shape of dark matter halos, making them more spherical than the ellipsoidal halos of collisionless dark matter. One method of probing the halo shapes, and thus the strength of the dark matter self-interaction, is by measuring the shape of the X-ray gas that traces the gravitational potential in relaxed elliptical galaxies. In this work we identify a sample of 11 relaxed, isolated elliptical galaxies and measure the ellipticity of the gravitating matter using X-ray images from the XMM-Newton and Chandra telescopes. We explore a variety of different mass configurations and find that the dark matter halos of these galaxies have ellipticities around $\epsilon\approx 0.2-0.5$. While we find non-negligible scatter in the ellipticity distribution, our results are consistent with some degree of self-interaction at the scale of $\sigma/m \sim 1$ cm$^2$/g, yet they also remain compatible with a cold dark matter scenario. We additionally demonstrate how our results can be used to directly constrain specific dark matter models and discuss implications for current and future simulations of self-interacting dark matter models.
    Ellipticity, Self-interacting dark matter, Elliptical galaxy, Dark matter, Cold dark matter, Dark matter halo, Mass distribution, Dark matter model, Hydrostatic equilibrium, Chandra X-ray Observatory, ...
  • The massive neutrinos of the Cosmic Neutrino Background (C$\nu$B) are fundamental ingredients of the radiation-dominated early universe and are important non-relativistic probes of the large-scale structure formation in the late universe. The dominant source of anisotropies in the neutrino flux distribution on the sky is highly amplified integrals of metric perturbations encountered during the non-relativistic phase of the C$\nu$B. This paper numerically compares the line-of-sight methods for computing C$\nu$B anisotropies with the Einstein-Boltzmann hierarchy solutions in linear theory for a range of neutrino masses. Angular power spectra are computed that are relevant to a future polarized tritium target run of the PTOLEMY experiment. Correlations between the C$\nu$B sky maps and galactic survey data are derived using line-of-sight techniques and discussed in the context of multi-messenger astrophysics.
    Neutrino, Anisotropy, Massive neutrino, Cosmic neutrino background, Line of sight, Neutrino mass, Cosmic microwave background, Neutrino capture, Boltzmann hierarchy, Angular power spectrum, ...
  • Cosmological scenarios with k-essence are invoked in order to explain the observed late-time acceleration of the universe. These scenarios avoid the need for fine-tuned initial conditions (the "coincidence problem") because of the attractor-like dynamics of the k-essence field $\phi$. It was recently shown that all k-essence scenarios with Lagrangians $p=L(X)/\phi^2$ necessarily involve an epoch where perturbations of $\phi$ propagate faster than light (the "no-go theorem"). We carry out a comprehensive study of attractor-like cosmological solutions ("trackers") involving a k-essence scalar field $\phi$ and another matter component. The result of this study is a complete classification of k-essence Lagrangians that admit asymptotically stable tracking solutions, among all Lagrangians of the form $p=K(\phi)L(X)$. Using this classification, we select the class of models that describe the late-time acceleration and avoid the coincidence problem through the tracking mechanism. An analogous "no-go theorem" still holds for this class of models, indicating the existence of a superluminal epoch. In the context of k-essence cosmology, the superluminal epoch does not lead to causality violations. We discuss the implications of superluminal signal propagation for possible causality violations in Lorentz-invariant field theories.
    Cosmology, K-essence, Causality, Attractor, No-go theorem, Luminosity function, Coincidence problem, Field theory, Classification, Scalar field, ...
  • We study the radial acceleration relation (RAR) between the total ($a_{\rm tot}$) and baryonic ($a_{\rm bary}$) centripetal acceleration profiles of central galaxies in the cold dark matter (CDM) paradigm. We analytically show that the RAR is intimately connected with the physics of the quasi-adiabatic relaxation of dark matter in the presence of baryons in deep potential wells. This cleanly demonstrates how a near-universal mean RAR and its scatter emerges in the low-acceleration regime ($10^{-12}\,{\rm m\,s}^{-2}\lesssim a_{\rm bary}\lesssim10^{-10}\,{\rm m\,s}^{-2}$) from an interplay between baryonic feedback processes and the distribution of CDM in dark halos. Our framework allows us to go further and study both higher and lower accelerations in detail, using analytical approximations and a realistic mock catalog of $\sim342,000$ low-redshift central galaxies with $M_r\leq-19$. We show that, while the RAR in the baryon-dominated, high-acceleration regime ($a_{\rm bary}\gtrsim10^{-10}\,{\rm m\,s}^{-2}$) is very sensitive to details of the relaxation physics, a simple `baryonification' prescription matching the relaxation results of hydrodynamical CDM simulations is remarkably successful in reproducing the observed RAR without any tuning. And in the (currently unobserved) ultra-low-acceleration regime ($a_{\rm bary}\lesssim 10^{-12}\,{\rm m\,s}^{-2}$), the RAR is sensitive to the abundance of diffuse gas in the halo outskirts, with our default model predicting a distinctive break from a simple power-law-like relation for HI-deficient, diffuse gas-rich centrals. Our mocks also show that the RAR provides more robust, testable predictions of the $\Lambda$CDM paradigm at galactic scales, with implications for alternative gravity theories, than the baryonic Tully-Fisher relation.
    Galaxy, Relaxation, Baryonic Tully-Fisher relation, Rotation Curve, Dark matter, Virial mass, Luminosity, Dark Matter Density Profile, Diffuse gas, Modified Newtonian Dynamics, ...
  • It was recently shown that the Madelung equations, that is, a hydrodynamic form of the Schr\"odinger equation, can be derived from a canonical ensemble of neural networks where the quantum phase was identified with the free energy of hidden variables. We consider instead a grand canonical ensemble of neural networks, by allowing an exchange of neurons with an auxiliary subsystem, to show that the free energy must also be multivalued. By imposing the multivaluedness condition on the free energy we derive the Schr\"odinger equation with "Planck's constant" determined by the chemical potential of hidden variables. This shows that quantum mechanics provides a correct statistical description of the dynamics of the grand canonical ensemble of neural networks at the learning equilibrium. We also discuss implications of the results for machine learning, fundamental physics and, in a more speculative way, evolutionary biology.
    Neural network, Entropy production, Grand canonical ensemble, Quantum mechanics, Planck's constant, Entropy, Statistical ensemble, Inference, Madelung equations, Canonical ensemble, ...
  • We calculate the four-wave scattering amplitude in the background of an AdS traversable wormhole in 2+1 dimensions created by a nonlocal coupling of AdS boundaries in the BTZ black hole background. The holographic dual of this setup is a pair of CFTs coupled via a double-trace deformation, the scattering amplitude giving the out-of-time ordered correlation function (OTOC) in CFT. A short-lived wormhole exhibits fast chaos (fast scrambling), with the Lyapunov exponent growing linearly with the temperature $T$, and in some cases even reaching the conjectured maximum value $2\pi T$ found in thermal black hole backgrounds. A drastic slowdown of scrambling is obtained for long-lived wormholes, where the OTOC grows exponentially slowly, eventually dropping to zero when the double-trace coupling strength is large enough. Our findings have parallels in strongly coupled disordered field theory models, and may indicate certain limitations of wormhole teleportation protocols previously studied in the literature.
    Wormhole, Anti de Sitter space, Chaos, Propagator, Shock wave, Black hole, Field theory, Geodesic, Horizon, Scattering amplitude, ...
  • We show that the correlator of three large charge operators with minimal scaling dimension can be computed semiclassically in CFTs with a $U(1)$ symmetry for arbitrary fixed values of the ratios of their charges. We obtain explicitly the OPE coefficient from the numerical solution of a nonlinear boundary value problem in the conformal superfluid EFT in $3d$. The result applies in all three-dimensional CFTs with a $U(1)$ symmetry whose large charge sector is a superfluid.
    OPE coefficients, Scaling dimension, Conformal field theory, Superfluid, Saddle point, Path integral, Boundary value problem, Conformal invariance, Scaling limit, Goldstone boson, ...
  • We study the large $N$ matrix model for the index of 4d $\mathcal{N}=4$ Yang-Mills theory and its truncations to understand the dual AdS$_5$ black holes. Numerical studies of the truncated models provide insights on the black hole physics, some of which we investigate analytically with the full Yang-Mills matrix model. In particular, we find many branches of saddle points which describe the known black hole solutions. We analytically construct the saddle points dual to the small black holes whose sizes are much smaller than the AdS radius. They include the asymptotically flat BMPV black holes embedded in large AdS with novel thermodynamic instabilities.
    Black hole, Saddle point, Random matrix theory, Anti de Sitter space, Instability, AdS black hole, Entropy, Graviton, Deconfinement, BPS black hole, ...
  • Understanding the entanglement of radiation in QFT has been a long standing challenge in high energy physics, with implications ranging from black hole thermodynamics to quantum information. Progress has been traditionally limited to consideration of either universal quantities fixed by symmetries, or global properties of the asymptotic states. Here we demonstrate how the free fermion in $1+1$ dimensions allows us to go beyond this by revealing the details of the density matrix of the radiation produced by a moving mirror, which in general breaks all conformal symmetries. We achieve this by using the resolvent method rather than standard CFT techniques, and derive closed expressions for the R\'enyi entropies, modular Hamiltonian and flow of the radiation. We determine the conditions under which mirrors generate unitary transformations, leading to Page curves resembling those expected from black hole evaporation. These results also yield the R\'enyi entropies on AdS$_2$ with reflecting asymptotic boundary conditions, which have applications to recent discussions of Hawking radiation. The results are ready to be used for a variety of applications in the field.
    Entropy, Chirality, Anti de Sitter space, Renyi entropy, Modular Hamiltonian, Mutual information, Entanglement, Entanglement entropy, Conformal symmetry, Free fermions, ...
  • Koopman mode decomposition and tensor component analysis (also known as CANDECOMP/PARAFAC or canonical polyadic decomposition) are two popular approaches of decomposing high dimensional data sets into low dimensional modes that capture the most relevant features and/or dynamics. Despite their similar goal, the two methods are largely used by different scientific communities and formulated in distinct mathematical languages. We examine the two together and show that, under a certain (reasonable) condition on the data, the theoretical decomposition given by tensor component analysis is the \textit{same} as that given by Koopman mode decomposition. This provides a "bridge" with which the two communities should be able to more effectively communicate. When this condition is not met, Koopman mode decomposition still provides a tensor decomposition with an \textit{a priori} computable error, providing an alternative to the non-convex optimization that tensor component analysis requires. Our work provides new possibilities for algorithmic approaches to Koopman mode decomposition and tensor component analysis, provides a new perspective on the success of tensor component analysis, and builds upon a growing body of work showing that dynamical systems, and Koopman operator theory in particular, can be useful for problems that have historically made use of optimization theory.
    Optimization, Principal component analysis, Eigenfunction, Operator theory, Column vector, Least squares, Permutation, Rank, Time Series, Exponential function, ...
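For the Koopman side of the comparison above, a bare-bones dynamic mode decomposition (a common way of computing Koopman modes from snapshot data) fits in a few lines of numpy; this is the generic textbook construction, not the authors' algorithm or their tensor-decomposition bridge.

```python
# Bare-bones exact DMD: fit X_{t+1} ~= A X_t in a reduced basis and eigendecompose.
import numpy as np

def dmd(X, rank=None):
    """X: (n_features, n_snapshots) matrix of sequential snapshots."""
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    if rank is not None:
        U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    # Reduced linear operator acting on the POD coordinates.
    A_tilde = U.conj().T @ X2 @ Vh.conj().T @ np.diag(1.0 / s)
    eigvals, W = np.linalg.eig(A_tilde)
    modes = X2 @ Vh.conj().T @ np.diag(1.0 / s) @ W  # exact DMD modes
    return eigvals, modes

# Tiny synthetic test: a single decaying oscillation observed in two coordinates.
t = np.linspace(0.0, 10.0, 200)
X = np.vstack([np.cos(2 * t) * np.exp(-0.1 * t),
               np.sin(2 * t) * np.exp(-0.1 * t)])
eigvals, modes = dmd(X, rank=2)
print(np.abs(eigvals))  # both magnitudes slightly below 1 (decay)
```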
  • The understanding of nonlinear, high dimensional flows, e.g., atmospheric and ocean flows, is critical to address the impacts of global climate change. Data Assimilation techniques combine physical models and observational data, often in a Bayesian framework, to predict the future state of the model and the uncertainty in this prediction. Inherent in these systems are noise (Gaussian and non-Gaussian), nonlinearity, and high dimensionality that pose challenges to making accurate predictions. To address these issues we investigate the use of both model and data dimension reduction based on techniques including Assimilation in Unstable Subspaces, Proper Orthogonal Decomposition, and Dynamic Mode Decomposition. Algorithms that take advantage of projected physical and data models may be combined with Data Assimilation techniques such as Ensemble Kalman Filter and Particle Filter variants. The projected Data Assimilation techniques are developed for the optimal proposal particle filter and applied to the Lorenz'96 and Shallow Water Equations to test the efficacy of our techniques in high dimensional, nonlinear systems.
    Particle filter, Data assimilation, Rank, Singular value, Covariance, Dimension reduction, Bayesian, Finite difference, Climate, Sparsity, ...
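For readers unfamiliar with the filtering building block mentioned above, a bare-bones bootstrap particle filter for a scalar toy model is sketched below; the paper's optimal-proposal and projected variants, and the Lorenz'96 / shallow-water test systems, are considerably more involved.

```python
# Bootstrap particle filter for a scalar random-walk state observed with Gaussian noise.
import numpy as np

rng = np.random.default_rng(1)
T, N = 50, 500        # time steps, particles
q, r = 0.1, 0.5       # process / observation noise standard deviations

# Simulate a truth trajectory and noisy observations.
x_true = np.cumsum(rng.normal(0.0, q, T))
y = x_true + rng.normal(0.0, r, T)

particles = rng.normal(0.0, 1.0, N)
estimates = []
for t in range(T):
    particles = particles + rng.normal(0.0, q, N)       # propagate through the model
    w = np.exp(-0.5 * ((y[t] - particles) / r) ** 2)    # likelihood weights
    w /= w.sum()
    particles = particles[rng.choice(N, size=N, p=w)]   # resample (bootstrap step)
    estimates.append(particles.mean())

print("RMSE:", np.sqrt(np.mean((np.array(estimates) - x_true) ** 2)))
```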
  • We propose the operation of LEvEL, the Low-Energy Neutrino Experiment at the LHC, a neutrino detector near the Large Hadron Collider Beam Dump. Such a detector is capable of exploring an intense, low-energy neutrino flux and can measure neutrino cross sections that have previously never been observed. These cross sections can inform other future neutrino experiments, such as those aiming to observe neutrinos from supernovae, allowing such measurements to accomplish their fundamental physics goals. We perform detailed simulations to determine neutrino production at the LHC beam dump, as well as neutron and muon backgrounds. Measurements at a few to ten percent precision of neutrino-argon charged current and neutrino-nucleus coherent scattering cross sections are attainable with 100 ton-year and 1 ton-year exposures at LEvEL, respectively, concurrent with the operation of the High Luminosity LHC. We also estimate signal and backgrounds for an experiment exploiting the forward direction of the LHC beam dump, which could measure neutrinos above 100 GeV.
    Neutrino, Large Hadron Collider, Neutrino flux, Beam dump, Muon, Decay at Rest, Liquid argon time-projection chamber, Pion, Graphite, Supernova, ...
  • With a better understanding of the loss surfaces for multilayer networks, we can build more robust and accurate training procedures. Recently it was discovered that independently trained SGD solutions can be connected along one-dimensional paths of near-constant training loss. In this paper, we show that there are mode-connecting simplicial complexes that form multi-dimensional manifolds of low loss, connecting many independently trained models. Inspired by this discovery, we show how to efficiently build simplicial complexes for fast ensembling, outperforming independently trained deep ensembles in accuracy, calibration, and robustness to dataset shift. Notably, our approach only requires a few training epochs to discover a low-loss simplex, starting from a pre-trained solution. Code is available at https://github.com/g-benton/loss-surface-simplexes.
    Neural network, Stochastic gradient descent, Bayesian Model Averaging, Regularization, Manifold, Calibration, Optimization, Architecture, Deep learning, Bayesian, ...
  • We investigate the dust attenuation in both stellar populations and ionized gas in kpc-scale regions in nearby galaxies, using integral field spectroscopy data from MaNGA MPL-9. We identify star-forming (HII) and diffuse ionized gas (DIG) regions from MaNGA datacubes. From the stacked spectrum of each region, we measure the stellar attenuation, $E(B-V)_{\rm star}$, using the technique developed by Li et al. (2020), as well as the gas attenuation, $E(B-V)_{\rm gas}$, from the Balmer decrement. We then examine the correlation of $E(B-V)_{\rm star}$, $E(B-V)_{\rm gas}$, $E(B-V)_{\rm gas}-E(B-V)_{\rm star}$ and $E(B-V)_{\rm star}/E(B-V)_{\rm gas}$ with 16 regional/global properties, and for regions with different $\rm H{\alpha}$ surface brightnesses ($\Sigma_{\rm H\alpha}$). We find a stronger correlation between $E(B-V)_{\rm star}$ and $E(B-V)_{\rm gas}$ in regions of higher $\Sigma_{\rm H\alpha}$. Luminosity-weighted age ($t_L$) is found to be the property that is the most strongly correlated with $E(B-V)_{\rm star}$, and consequently with $E(B-V)_{\rm gas}-E(B-V)_{\rm star}$ and $E(B-V)_{\rm star}/E(B-V)_{\rm gas}$. At fixed $\Sigma_{\rm H\alpha}$, $\log_{10}t_L$ is linearly and negatively correlated with $E(B-V)_{\rm star}/E(B-V)_{\rm gas}$ at all ages. Gas-phase metallicity and ionization level are important for the attenuation in the gas. Our results indicate that the ionizing source for DIG regions is likely distributed in the outskirts of galaxies, while for HII regions our results can be well explained by the two-component dust model of Charlot & Fall (2000).
    Dust attenuation curve, SDSS-IV survey MaNGA, Galaxy, Stellar ages, Star, Metallicity, Ionization, Stellar populations, Signal to noise ratio, Star formation, ...
  • We present an analytic model for the splashback mass function of dark matter halos, which is parameterized by a single coefficient and constructed in the framework of the generalized excursion set theory and the self-similar spherical infall model. The value of the single coefficient that quantifies the diffusive nature of the splashback boundary is determined at various redshifts by comparing the model with the numerical results from the Erebos N-body simulations for the Planck and the WMAP7 cosmologies. Showing that the analytic model with the best-fit coefficient provides excellent matches to the numerical results in a wide mass range at all redshifts, we employ the Bayesian Information Criterion test to confirm that our model is preferred by the numerical results over the previous models at almost all redshifts for both cosmologies. It is also found that the diffusion coefficient decreases almost linearly with redshift, converging to zero at a certain threshold redshift, $z_{c}$, whose value significantly differs between the Planck and WMAP7 cosmologies. Our result implies that the splashback mass function of dark matter halos at $z\ge z_{c}$ is well described by a universal parameter-free analytic formula and that $z_{c}$ may have the potential to independently constrain the initial conditions of the universe.
    Mass function, Cosmology, Dark matter halo, Virial mass, Planck mission, Excursion set model, Diffusion coefficient, Bayesian information criterion, Splashback radius, N-body simulation, ...
  • Quantum particles move in strange ways, even when they propagate freely in space. As a result of the uncertainty principle, it is not possible to control the initial conditions of particle emission in such a way that the particle will definitely pass through two precisely defined positions along its path, even if it is possible to line up the two positions with the emitter. However, there is also an upside to the quantum mechanical laws of motion: constructive quantum interferences can actually raise probabilities to values higher than those permitted by classical causality. Here, it is shown that conventional interferometric methods can be used to prepare photons in a quantum state in which a non-vanishing fraction of particles will hit both of two possible targets, even though the direct line-of-sight connecting the two targets is blocked at the source. The demonstration of the effect is complicated by the uncertainty principle because the physical detection of a particle at one target disturbs the motion of the particle, making it impossible to determine whether the initial state of motion would have allowed the particle to hit the other target or not. It is nonetheless possible to determine the minimal fraction of "magic bullet" particles that must have hit both targets by showing that the number of particles hitting target A is larger than the number of particles missing target B. Quantum interference effects can thus be used to optimize the path of particles in free space beyond the classical limit of motion along a straight line.
    Line of sight, Interference, Wavefunction, Uncertainty principle, Superposition, Classical limit, Statistics, Causality, Interferometry, Transverse momentum, ...
  • In this paper, we examine regularity and stability issues for two damped abstract elastic systems. The damping involves the average velocity and a fractional power $\theta$, with $\theta$ in $[-1,1]$, of the principal operator. The matrix operator defining the damping mechanism for the coupled system is degenerate. First, we prove that for $\theta$ in $(1/2,1]$, the underlying semigroup is not analytic, but is differentiable for $\theta$ in $(0,1)$; this is in sharp contrast with known results for a single similarly damped elastic system, where the semigroup is analytic for $\theta$ in $[1/2,1]$; this shows that the degeneracy dominates the dynamics of the interacting systems, preventing analyticity in that range. Next, we show that for $\theta$ in $(0,1/2]$, the semigroup is of certain Gevrey classes. Finally, we show that the semigroup decays exponentially for $\theta$ in $[0,1]$, and polynomially for $\theta$ in $[-1,0)$. To prove our results, we use the frequency domain method, which relies on resolvent estimates. Optimality of our resolvent estimates is also established. Several examples of application are provided.
    Cauchy-Schwarz inequality, Complex number, Infinitesimal generator, Attention, Decay rate, Eigenfunction, Lumer-Phillips theorem, Unbounded operator, Resolvent set, Densely defined operator, ...
  • In this work, we present a number of generator matrices of the form $[I_{kn} \ | \ \tau_k(v)],$ where $I_{kn}$ is the $kn \times kn$ identity matrix, $v$ is an element in the group matrix ring $M_2(R)G$ and where $R$ is a finite commutative Frobenius ring and $G$ is a finite group of order 18. We employ these generator matrices and search for binary $[72,36,12]$ self-dual codes directly over the finite field $\mathbb{F}_2.$ As a result, we find 134 Type I and 1 Type II codes of this length, with parameters in their weight enumerators that were not known in the literature before. We tabulate all of our findings.
    Matrix ring, Group ring, Galois field, Singly and doubly even, Circulant matrix, Attention, Software, Block matrix, Automorphism, Infinite group, ...
  • We propose a new family of polar codes which realizes high coding gain, low complexity, and high throughput by introducing a protograph-based design. The proposed technique, called quasi-cyclic (QC) polar codes, can be highly parallelized without sacrificing decoding complexity. We analyze short cycles in the protograph polar codes and develop a design method to increase the girth. Our approach can resolve the long-standing unsolved problem that belief propagation (BP) decoding does not work well for polar codes due to the inherently short cycles. We demonstrate that a high lifting factor of QC polar codes can improve the performance and that QC polar codes with BP decoding can outperform conventional polar codes with state-of-the-art list decoding. Moreover, we show that a greedy pruning method can improve the performance-complexity trade-off.
    Permutation, Belief propagation, Graph, Scheduling, Internet of Things, Galois field, Permutation matrix, Signal to noise ratio, Attention, Latency-critical application, ...
  • Polar codes are the first class of channel codes achieving the symmetric capacity of binary-input discrete memoryless channels with efficient encoding and decoding algorithms. But the weight spectrum of Polar codes is relatively poor compared to RM codes, which degrades their ML performance. Pre-transformation with an upper-triangular matrix (including cyclic redundancy check (CRC), parity-check (PC) and polarization-adjusted convolutional (PAC) codes) improves the weight spectrum while retaining polarization. In this paper, the weight spectrum of upper-triangular pre-transformed Polar codes is mathematically analyzed. In particular, we focus on calculating the number of low-weight codewords due to their impact on error-correction performance. Simulation results verify the accuracy of the analysis.
    Information set, Coset, Diffusion Monte Carlo, Completeness, Permutation, Boiling, Transformations, Simulations, Polarization, Algorithms, ...
  • Causally consistent distributed storage systems have received significant recent attention due to the potential for providing a low latency data access as compared with linearizability. Current causally consistent data stores use partial or full replication to ensure data access to clients over a distributed setting. In this paper, we develop, for the first time, an erasure coding based algorithm called CausalEC that ensures causal consistency for a collection of read-write objects stored in a distributed set of nodes over an asynchronous message passing system. CausalEC can use an arbitrary linear erasure code for data storage, and ensures liveness and storage properties prescribed by the erasure code. CausalEC retains a key benefit of previously designed replication-based algorithms - every write operation is local, that is, a server performs only local actions before returning to a client that issued a write operation. For servers that store certain objects in an uncoded manner, read operations to those objects also return locally. In general, a read operation to an object can be returned by a server on contacting a small subset of other servers so long as the underlying erasure code allows for the object to be decoded from that subset. As a byproduct, we develop EventualEC, a new eventually consistent erasure coding based data storage algorithm. A novel technical aspect of CausalEC is the use of cross-object erasure coding, where nodes encode values across multiple objects, unlike previous consistent erasure coding based solutions. CausalEC navigates the technical challenges of cross-object erasure coding, in particular, pertaining to re-encoding the objects when writes update the values and ensuring that reads are served in the transient state where the system transitions to storing the codeword symbols corresponding to the new object versions.
    Erasure, Attention, Termination, Algorithm design, Galois field, Data center, Byzantine fault, Multidimensional Array, Google.com, Graph, ...
  • As a black hole evaporates, each outgoing Hawking quantum carries away some of the black hole's asymptotic charges associated with the extended Bondi-Metzner-Sachs group. These include the Poincar\'e charges of energy, linear momentum, intrinsic angular momentum, and orbital angular momentum or center-of-mass charge, as well as extensions of these quantities associated with supertranslations and super-Lorentz transformations, namely supermomentum, superspin and super center-of-mass charges (also known as soft hair). Since each emitted quantum has fluctuations that are of order unity, fluctuations in the black hole's charges grow over the course of the evaporation. We estimate the scale of these fluctuations using a simple model. The results are, in Planck units: (i) The black hole position has an uncertainty of $\sim M_i^2$ at late times, where $M_i$ is the initial mass (previously found by Page). (ii) The black hole mass $M$ has an uncertainty of order the mass $M$ itself at the epoch when $M \sim M_i^{2/3}$, well before the Planck scale is reached. Correspondingly, the time at which the evaporation ends has an uncertainty of order $\sim M_i^2$. (iii) The supermomentum and superspin charges are not independent but are determined from the Poincar\'e charges and the super center-of-mass charges. (iv) The supertranslation that characterizes the super center-of-mass charges has fluctuations at multipole orders $l$ of order unity that are of order unity in Planck units. At large $l$, there is a power law spectrum of fluctuations that extends up to $l \sim M_i^2/M$, beyond which the fluctuations fall off exponentially, with corresponding total rms shear tensor fluctuations $\sim M_i M^{-3/2}$.
    Black hole, Evaporation, Evolution equation, Planck mission, Black hole mass, Black hole evaporation, Sheared, Gravitational wave, Orbital angular momentum of light, Infrared limit, ...
  • This work develops a new method, based on the use of Gustafson's integrals and on the evaluation of singular integrals, allowing one to establish the unitarity of the separation of variables transform for infinite dimensional representations of rank one quantum integrable models. We examine in detail the case of the $\mathrm{SL}(2,\mathbb R)$ spin chains.
    Eigenfunction, Unitarity, Separation of variables, Isometry, Completeness, Support function, Hamiltonian, Permutation, Bethe ansatz, Completeness relation, ...
  • We examine the maximum possible strength of the global 21-cm absorption dip on the Cosmic Background Radiation at high-redshift caused by the atomic intergalactic medium, when the Lyman-$\alpha$ coupling is maximum, assuming no exotic cooling mechanisms from interactions with dark matter. This maximum absorption is limited by three inevitable factors that need to be accounted for: $(a)$ heating by energy transferred from the Cosmic Background Radiation to the hydrogen atoms via 21-cm transitions, dubbed as 21-cm heating; $(b)$ Ly$\alpha$ heating by scatterings of Ly$\alpha$ photons from the first stars; $(c)$ the impact of the expected density fluctuations in the intergalactic gas in standard Cold Dark Matter theory, which reduces the mean 21-cm absorption signal. Inclusion of this third novel effect reduces the maximum global 21-cm absorption by $\sim 10\%$. Overall, the three effects studied here reduce the 21-cm global absorption by $\sim 20\%$ at $z \simeq 17$.
    Hydrogen 21 cm line, Intergalactic medium, Kinetic temperature, Cosmic background radiation, Absorption feature, Population III, Star formation, Cold dark matter, Dark matter, X-ray heating, ...
  • With the rapidly increasing integration density and power density in nanoscale electronic devices, the thermal management concerning heat generation and energy harvesting becomes quite crucial. Since phonon is the major heat carrier in semiconductors, thermal transport due to phonons in mesoscopic systems has attracted much attention. In quantum transport studies, the nonequilibrium Green's function (NEGF) method is a versatile and powerful tool that has been developed for several decades. In this review, we will discuss theoretical investigations of thermal transport using the NEGF approach from two aspects. For the aspect of phonon transport, the phonon NEGF method is briefly introduced and its applications on thermal transport in mesoscopic systems including one-dimensional atomic chains, multi-terminal systems, and transient phonon transport are discussed. For the aspect of thermoelectric transport, the caloritronic effects in which the charge, spin, and valley degrees of freedom are manipulated by the temperature gradient are discussed. The time-dependent thermoelectric behavior is also presented in the transient regime within the partitioned scheme based on the NEGF method.
    Phonon, Green's function, Thermal conductivity, Thermoelectric power, Graphene nano-ribbons, Graphene, Degree of freedom, Self-energy, Attention, Steady state, ...
  • We reconstruct the neutrino mass as a function of redshift, $z$, from current cosmological data using both standard binned priors and linear spline priors with variable knots. Using cosmic microwave background temperature, polarization and lensing data, in combination with distance measurements from baryonic acoustic oscillations and supernovae, we find that the neutrino mass is consistent with $\sum m_\nu(z)=$ const. We obtain a larger bound on the neutrino mass at low redshifts coinciding with the onset of dark energy domination, $\sum m_\nu(z=0)<1.41$ eV (95% CL). This result can be explained either by the well-known degeneracy between $\sum m_\nu$ and $\Omega_\Lambda$ at low redshifts, or by models in which neutrino masses are generated very late in the Universe. We convert our results into cosmological limits for models with post-recombination neutrino decay and find $\sum m_\nu <0.19$ eV (95% CL), which is below the sensitivity of the KATRIN experiment. Thus, a neutrino mass discovery by KATRIN would hint towards models predicting both post-recombination neutrino mass generation and subsequent relic neutrino annihilation.
    Neutrino mass, Neutrino, Dark energy, Baryon acoustic oscillations, Cosmological neutrinos, Recombination, Cosmic microwave background, Dark Radiation, KATRIN experiment, Cosmological data, ...
  • We construct 4-dimensional CAT(0) groups containing finitely presented subgroups whose Dehn functions are $\exp^{(n)}(x^m)$ for integers $n, m \geq 1$ and 6-dimensional CAT(0) groups containing finitely presented subgroups whose Dehn functions are $\exp^{(n)}(x^\alpha)$ for integers $n \geq 1$ and $\alpha$ dense in $[1,\infty)$. This significantly expands the known geometric behavior of subgroups of CAT(0) groups.
    Subgroup, Graph, Embedding, Monomorphism, Geodesic, Covering space, Hyperbolic group, Semidirect product, Word metric, Metric space, ...
  • A one-by-one exhaustion is a combinatorial/geometric condition which excludes eigenvalues from the spectra of Laplace and Schr\"odinger operators on graphs. Isoperimetric inequalities in graphs with a cocompact automorphism group provide an upper bound on the von Neumann dimension of the space of eigenfunctions. Any finitely generated indicable amenable group has a Cayley graph without eigenvalues.
    Graph, Eigenfunction, Cayley graph, Density of states, Fundamental domain, Metric space, Automorphism, Multiplication operator, Random walk, Morphism, ...
  • A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. To address this shortfall, we introduce a new algorithm called Go-Explore. It exploits the following principles: (1) remember previously visited states, (2) first return to a promising state (without exploration), then explore from it, and (3) solve simulated environments through any available means (including by introducing determinism), then robustify via imitation learning. The combined effect of these principles is a dramatic performance improvement on hard-exploration problems. On Montezuma's Revenge, Go-Explore scores a mean of over 43k points, almost 4 times the previous state of the art. Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge. Its max performance of nearly 18 million surpasses the human world record, meeting even the strictest definition of "superhuman" performance. On Pitfall, Go-Explore with domain knowledge is the first algorithm to score above zero. Its mean score of almost 60k points exceeds expert human performance. Because Go-Explore produces high-performing demonstrations automatically and cheaply, it also outperforms imitation learning work where humans provide solution demonstrations. Go-Explore opens up many new research directions into improving it and weaving its insights into current RL algorithms. It may also enable progress on previously unsolvable hard-exploration problems in many domains, especially those that harness a simulator during training (e.g. robotics).
    Sparsity, Robotics, Hyperparameter, Neural network, Graph, Reinforcement learning, Termination, Optimization, Confidence interval, Partially observable Markov decision process, ...
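The three Go-Explore principles listed above translate into a fairly short control loop; the sketch below is a heavily simplified, hypothetical rendering (generic gym-style `env`, a coarse `cell()` hash instead of the paper's frame downscaling, purely random exploration, and no robustification phase), not the authors' implementation.

```python
# Minimal Go-Explore-style loop: keep an archive of visited "cells", return to a
# promising cell by replaying its action sequence (assumes a deterministic,
# resettable environment), then explore from it and update the archive.
import random

def cell(obs):
    # Hypothetical coarse abstraction of an observation.
    return tuple(int(x) // 8 for x in obs)

def go_explore(env, iterations=1000, explore_steps=50):
    obs = env.reset()
    archive = {cell(obs): {"actions": [], "score": 0.0}}
    for _ in range(iterations):
        # 1) Select a cell to return to (uniform here; the paper weights promising cells).
        target = random.choice(list(archive.values()))
        # 2) Return: replay the stored action sequence without exploration.
        env.reset()
        score, actions = 0.0, list(target["actions"])
        for a in actions:
            obs, reward, done, _ = env.step(a)
            score += reward
        # 3) Explore from the restored state with random actions.
        for _ in range(explore_steps):
            a = env.action_space.sample()
            obs, reward, done, _ = env.step(a)
            actions.append(a)
            score += reward
            c = cell(obs)
            if c not in archive or score > archive[c]["score"]:
                archive[c] = {"actions": list(actions), "score": score}
            if done:
                break
    return archive
```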
  • By adhering to the dictum, "No causation without manipulation (treatment, intervention)", cause and effect data analysis represents changes in observed data in terms of changes in the causal factors. When causal factors are not amenable to active manipulation in the real world due to current technological limitations or ethical considerations, a counterfactual approach performs an intervention on the model of data formation. In the case of object representation or activity (temporal object) representation, varying object parts is generally unfeasible, whether they be spatial and/or temporal. Multilinear algebra, the algebra of higher-order tensors, is a suitable and transparent framework for disentangling the causal factors of data formation. Learning part-based intrinsic causal factor representations in a multilinear framework requires applying a set of interventions on a part-based multilinear model. We propose a unified multilinear model of wholes and parts. We derive a hierarchical block multilinear factorization, the M-mode Block SVD, that computes a disentangled representation of the causal factors by optimizing simultaneously across the entire object hierarchy. Given computational efficiency considerations, we introduce an incremental bottom-up computational alternative, the Incremental M-mode Block SVD, that employs the lower-level abstractions, the part representations, to represent the higher-level abstractions, the parent wholes. This incremental computational approach may also be employed to update the causal model parameters when data becomes available incrementally. The resulting object representation is an interpretable combinatorial choice of intrinsic causal factor representations related to an object's recursive hierarchy of wholes and parts that renders object recognition robust to occlusion and reduces training data requirements.
    Causal factor, Multi-way array, Training set, Causality, Circulant matrix, Latent variable, Vector space, Convolution Neural Network, Optimization, Multidimensional Array, ...
  • This paper does not describe a working system. Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.
    Embedding, Neural network, Attention, Architecture, Intensity, Convolution Neural Network, Contrastive learning, Bumping, Distillation, Autoencoder, ...