Recently bookmarked papers

with concepts:
  • Machine learning classifiers are often trained to recognize a set of pre-defined classes. However, in many applications, it is often desirable to have the flexibility of learning additional concepts, with limited data and without re-training on the full training set. This paper addresses this problem, incremental few-shot learning, where a regular classification network has already been trained to recognize a set of base classes, and several extra novel classes are being considered, each with only a few labeled examples. After learning the novel classes, the model is then evaluated on the overall classification performance on both base and novel classes. To this end, we propose a meta-learning model, the Attention Attractor Network, which regularizes the learning of novel classes. In each episode, we train a set of new weights to recognize novel classes until they converge, and we show that the technique of recurrent back-propagation can back-propagate through the optimization process and facilitate the learning of these parameters. We demonstrate that the learned attractor network can help recognize novel classes while remembering old classes without the need to review the original training set, outperforming various baselines.
    Attention, Attractor, Meta learning, Attractor network, Classification, Optimization, Backpropagation, Training set, Logistic regression, Machine learning, ...
  • The curse of dimensionality associated with the Hilbert space of spin systems provides a significant obstruction to the study of condensed matter systems. Tensor networks have proven an important tool in attempting to overcome this difficulty in both the numerical and analytic regimes. These notes form the basis for a seven lecture course, introducing the basics of a range of common tensor networks and algorithms. In particular, we cover: introductory tensor network notation, applications to quantum information, basic properties of matrix product states, a classification of quantum phases using tensor networks, algorithms for finding matrix product states, basic properties of projected entangled pair states, and multiscale entanglement renormalisation ansatz states. The lectures are intended to be generally accessible, although the relevance of many of the examples may be lost on students without a background in many-body physics/quantum information. For each lecture, several problems are given, with worked solutions in an ancillary file.
    Quantum information, Matrix product states, Condensed matter system, Entanglement, Classification, Curse of dimensionality, Quantum phases, Tensor, Networks, Algorithms, ...
  • The constant introduction of standardized benchmarks in the literature has helped accelerate recent advances in meta-learning research. They offer a way to get a fair comparison between different algorithms, and the wide range of datasets available allows full control over the complexity of this evaluation. However, for a large majority of the code available online, the data pipeline is often specific to one dataset, and testing on another dataset requires significant rework. We introduce Torchmeta, a library built on top of PyTorch that enables seamless and consistent evaluation of meta-learning algorithms on multiple datasets, by providing data-loaders for most of the standard benchmarks in few-shot classification and regression, with a new meta-dataset abstraction. It also features some extensions for PyTorch to simplify the development of models compatible with meta-learning algorithms. The code is available here: https://github.com/tristandeleu/pytorch-meta
    Meta learning, Classification, Regression, Training set, Graph, Sine wave, Backpropagation, Optimization, Supervised learning, Reinforcement learning, ...
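    A minimal sketch of the episodic data-loading workflow described above, based on Torchmeta's documented helpers; the data folder, way/shot counts and batch size are placeholder choices, not recommendations from the paper:

      from torchmeta.datasets.helpers import omniglot
      from torchmeta.utils.data import BatchMetaDataLoader

      # 5-way, 1-shot episodes; download=True fetches Omniglot on first use.
      dataset = omniglot("data", ways=5, shots=1, test_shots=15,
                         meta_train=True, download=True)
      dataloader = BatchMetaDataLoader(dataset, batch_size=16, num_workers=2)

      for batch in dataloader:
          train_inputs, train_targets = batch["train"]   # support set of each task
          test_inputs, test_targets = batch["test"]      # query set of each task
          # ... run one inner/outer meta-learning update per batch of tasks ...
          break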
  • In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal predictors and reinforcement learners which behave as if they had a probabilistic model that allowed them to efficiently exploit task structure. Furthermore, we recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state-machine of sufficient statistics. Essentially, memory-based meta-learning translates the hard problem of probabilistic sequential inference into a regression problem.
    Meta learning, Bayesian, Statistics, Monte Carlo method, Regression, Inference, Architecture, Graph, Decision making, Optimization, ...
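    As a compact statement of the Bayesian view summarized above (a standard identity, not a result specific to this report): the Bayes-optimal sequential predictor for a task class indexed by $\tau$ is the posterior-predictive mixture $p(x_t \mid x_{<t}) = \int p(x_t \mid x_{<t}, \tau)\, p(\tau \mid x_{<t})\, d\tau$; a memory-based meta-learner amortizes it by regressing onto such Bayes-filtered targets, with its memory state acting as (approximate) sufficient statistics for $p(\tau \mid x_{<t})$.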
  • Meta-learning, or learning to learn, is the science of systematically observing how different machine learning approaches perform on a wide range of learning tasks, and then learning from this experience, or meta-data, to learn new tasks much faster than otherwise possible. Not only does this dramatically speed up and improve the design of machine learning pipelines or neural architectures, it also allows us to replace hand-engineered algorithms with novel approaches learned in a data-driven way. In this chapter, we provide an overview of the state of the art in this fascinating and continuously evolving field.
    Meta Feature, Optimization, Meta learning, Hyperparameter, Bayesian, Ranking, Neural network, Machine learning, Architecture, Rank, ...
  • In this paper, we propose a novel edge-labeling graph neural network (EGNN), which adapts a deep neural network on the edge-labeling graph, for few-shot learning. The previous graph neural network (GNN) approaches in few-shot learning have been based on the node-labeling framework, which implicitly models the intra-cluster similarity and the inter-cluster dissimilarity. In contrast, the proposed EGNN learns to predict the edge-labels rather than the node-labels on the graph, which enables the evolution of an explicit clustering by iteratively updating the edge-labels with direct exploitation of both the intra-cluster similarity and the inter-cluster dissimilarity. It is also well suited for operating on various numbers of classes without retraining, and can be easily extended to perform transductive inference. The parameters of the EGNN are learned by episodic training with an edge-labeling loss to obtain a well-generalizable model for unseen low-data problems. On both the supervised and semi-supervised few-shot image classification tasks with two benchmark datasets, the proposed EGNN significantly improves performance over the existing GNNs.
    Graph, Classification, Neural network, Inference, Meta learning, Attention, Deep Neural Networks, Ground truth, Embedding, Nearest-neighbor site, ...
  • The goal of few-shot learning is to recognize new visual concepts with just a small number of labeled samples in each class. Recent effective metric-based few-shot approaches employ neural networks to learn a feature similarity comparison between query and support examples. However, the importance of feature embedding, i.e., exploring the relationship among training samples, is neglected. In this work, we present a simple yet powerful baseline for few-shot classification by emphasizing the importance of feature embedding. Specifically, we revisit the classical triplet network from deep metric learning, and extend it into a deep K-tuplet network for few-shot learning, utilizing the relationship among the input samples to learn a general representation via episodic training. Once trained, our network is able to extract discriminative features for unseen novel categories and can be seamlessly incorporated with a non-linear distance metric function to facilitate the few-shot classification. Our result on the miniImageNet benchmark outperforms other metric-based few-shot classification methods. More importantly, when evaluated on completely different datasets (Caltech-101, CUB-200, Stanford Dogs and Cars) using the model trained with miniImageNet, our method significantly outperforms prior methods, demonstrating its superior capability to generalize to unseen classes.
    Classification, Embedding, Neural network, Networks, ...
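    One common way to generalize the triplet loss to a K-tuplet with a single positive and several negatives is sketched below in PyTorch; this illustrates the idea only and is not necessarily the exact objective or distance function used in the paper above.

      import torch
      import torch.nn.functional as F

      def k_tuplet_loss(anchor, positive, negatives, margin=1.0):
          """Margin-based K-tuplet loss (illustrative, not the paper's exact loss).

          anchor, positive: (B, D) embeddings; negatives: (B, K-2, D).
          """
          d_pos = F.pairwise_distance(anchor, positive)                    # (B,)
          d_neg = torch.cdist(anchor.unsqueeze(1), negatives).squeeze(1)   # (B, K-2)
          return F.relu(margin + d_pos.unsqueeze(1) - d_neg).mean()

      # Toy usage with random embeddings (batch of 4, 64-dim, K = 5).
      loss = k_tuplet_loss(torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 3, 64))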
  • Conventional methods for object detection usually require a substantial amount of training data, and preparing such high-quality training data is labor intensive. In this paper, we propose few-shot object detection, which aims to detect objects of unseen classes with a few training examples. Central to our method are the Attention-RPN and the multi-relation module, which fully exploit the similarity between the few-shot training examples and the test set to detect novel objects while suppressing false detections in the background. To train our network, we have prepared a new dataset which contains 1000 categories of various objects with high-quality annotations. To the best of our knowledge, this is also the first dataset specifically designed for few-shot object detection. Once our network is trained, we can apply object detection for unseen classes without further training or fine-tuning. This is also the major advantage of few-shot object detection. Our method is general, and has a wide range of applications. We demonstrate the effectiveness of our method quantitatively and qualitatively on different datasets. The dataset link is: https://github.com/fanq15/Few-Shot-Object-Detection-Dataset.
    Object detection, Attention, Training set, COCO simulation, Convolutional neural network, Classification, Main sequence star, Ranking, Embedding, Training Image, ...
  • In [1, 2], we have explored the theoretical aspects of feature extraction optimization processes for solving large-scale problems and overcoming machine learning limitations. The majority of the optimization algorithms introduced in [1, 2] guarantee the optimal performance of supervised learning, given offline and discrete data, to deal with the curse of dimensionality (CoD) problem. These algorithms, however, are not tailored for solving emerging learning problems. One of the important issues caused by online data is the lack of sufficient samples per class. Further, traditional machine learning algorithms cannot achieve accurate training based on limited distributed data, as data has proliferated and dispersed significantly. Machine learning employs a strict model or embedded engine to train and predict, which still fails to learn unseen classes and to make sufficient use of online data. In this chapter, we introduce these challenges in detail. We further investigate Meta-Learning (MTL) algorithms, and their application and promise for solving the emerging problems, by answering the question of how autonomous agents can learn to learn.
    Machine learning, Optimization, Meta learning, Training set, Supervised learning, Long short term memory, Classification, Feature extraction, Neural network, Vector space, ...
  • The field of few-shot learning has recently seen substantial advancements. Most of these advancements came from casting few-shot learning as a meta-learning problem. Model Agnostic Meta Learning or MAML is currently one of the best approaches for few-shot learning via meta-learning. MAML is simple, elegant and very powerful; however, it has a variety of issues, such as being very sensitive to neural network architectures, often leading to instability during training, requiring arduous hyperparameter searches to stabilize training and achieve high generalization, and being very computationally expensive at both training and inference times. In this paper, we propose various modifications to MAML that not only stabilize the system, but also substantially improve the generalization performance, convergence speed and computational overhead of MAML, which we call MAML++.
    Meta learning, Statistics, Hyperparameter, Optimization, Inference, Architecture, Instability, Neural network, Embedding, Training set, ...
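    For readers unfamiliar with the base algorithm, here is a minimal second-order MAML sketch on a toy linear regression model; the MAML++ modifications themselves are not shown, and the tasks, learning rates and model are illustrative assumptions:

      import torch

      w = torch.zeros(1, requires_grad=True)   # meta-learned initialization
      b = torch.zeros(1, requires_grad=True)
      inner_lr, outer_lr = 0.01, 1e-3

      def forward(x, w, b):
          return x * w + b

      def mse(pred, y):
          return ((pred - y) ** 2).mean()

      def meta_update(task_batch):
          """task_batch: iterable of (x_support, y_support, x_query, y_query) tensors."""
          meta_loss = 0.0
          for x_s, y_s, x_q, y_q in task_batch:
              # Inner loop: one gradient step on the support set; create_graph=True
              # keeps the graph so the outer update can differentiate through it.
              g_w, g_b = torch.autograd.grad(mse(forward(x_s, w, b), y_s),
                                             (w, b), create_graph=True)
              w_adapt, b_adapt = w - inner_lr * g_w, b - inner_lr * g_b
              # Outer objective: adapted parameters evaluated on the query set.
              meta_loss = meta_loss + mse(forward(x_q, w_adapt, b_adapt), y_q)
          g_w, g_b = torch.autograd.grad(meta_loss, (w, b))
          with torch.no_grad():                 # outer (meta) update of the initialization
              w -= outer_lr * g_w
              b -= outer_lr * g_b

      # Toy usage: 4 random tasks, each with 10 support and 10 query points.
      tasks = [tuple(torch.randn(10, 1) for _ in range(4)) for _ in range(4)]
      meta_update(tasks)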
  • Resembling the rapid learning capability of humans, low-shot learning empowers vision systems to understand new concepts by training with few samples. Leading approaches are derived from meta-learning on images with a single visual object. Confounded by complex backgrounds and multiple objects in one image, they are ill-suited to advancing research on low-shot object detection/segmentation. In this work, we present a flexible and general methodology to achieve these tasks. Our work extends Faster/Mask R-CNN by proposing meta-learning over RoI (Region-of-Interest) features instead of a full image feature. This simple spirit disentangles multi-object information merged with the background, without bells and whistles, enabling Faster/Mask R-CNN to turn into a meta-learner that achieves these tasks. Specifically, we introduce a Predictor-head Remodeling Network (PRN) that shares its main backbone with Faster/Mask R-CNN. PRN receives images containing low-shot objects with their bounding boxes or masks to infer their class-attentive vectors. The vectors apply channel-wise soft attention to RoI features, remodeling the R-CNN predictor heads to detect or segment the objects that are consistent with the classes these vectors represent. In our experiments, Meta R-CNN yields the state of the art in low-shot object detection and improves low-shot object segmentation over Mask R-CNN.
    Convolutional neural network, Object detection, Region of interest, Meta learning, Attention, COCO simulation, Ablation, Inference, Main sequence star, Optimization, ...
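    The channel-wise soft-attention step described above can be illustrated in a few lines of PyTorch; the sigmoid gating and the tensor shapes are assumptions for the sketch, not the paper's exact design:

      import torch

      def channel_attend(roi_features, class_vector):
          """roi_features: (N, C, H, W) RoI-aligned features; class_vector: (C,)."""
          gate = torch.sigmoid(class_vector).view(1, -1, 1, 1)   # per-channel weights
          return roi_features * gate                             # broadcast over N, H, W

      rois = torch.randn(8, 256, 7, 7)    # hypothetical RoI features
      attn_vec = torch.randn(256)         # hypothetical class-attentive vector from the PRN
      gated = channel_attend(rois, attn_vec)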
  • Few-shot classification aims to learn a classifier to recognize unseen classes during training with limited labeled examples. While significant progress has been made, the growing complexity of network designs, meta-learning algorithms, and differences in implementation details make a fair comparison difficult. In this paper, we present 1) a consistent comparative analysis of several representative few-shot classification algorithms, with results showing that deeper backbones significantly reduce the performance differences among methods on datasets with limited domain differences, 2) a modified baseline method that surprisingly achieves competitive performance when compared with the state-of-the-art on both the miniImageNet and the CUB datasets, and 3) a new experimental setting for evaluating the cross-domain generalization ability of few-shot classification algorithms. Our results reveal that reducing intra-class variation is an important factor when the feature backbone is shallow, but not as critical when using deeper backbones. In a realistic cross-domain evaluation setting, we show that a baseline method with a standard fine-tuning practice compares favorably against other state-of-the-art few-shot learning algorithms.
    Classification, Meta learning, Cosine similarity, Overfitting, Statistics, Attention, Completeness, Davies-Bouldin index, Euclidean distance, Data sampling, ...
  • Few-Shot Learning is the challenge of training a model with only a small amount of data. Many solutions to this problem use meta-learning algorithms, i.e. algorithms that learn to learn. By sampling few-shot tasks from a larger dataset, we can teach these algorithms to solve new, unseen tasks. This document reports my work on meta-learning algorithms for Few-Shot Computer Vision. This work was done during my internship at Sicara, a French company building image recognition solutions for businesses. It contains: 1. an extensive review of the state-of-the-art in few-shot computer vision; 2. a benchmark of meta-learning algorithms for few-shot image classification; 3. the introduction to a novel meta-learning algorithm for few-shot object detection, which is still in development.
    Classification, Meta learning, Object detection, Convolutional neural network, Image Processing, Training set, Long short term memory, Embedding, Machine learning, Confidence interval, ...
  • Meta-learning has emerged as an important framework for learning new tasks from just a few examples. The success of any meta-learning model depends on (i) its fast adaptation to new tasks, as well as (ii) having a shared representation across similar tasks. Here we extend the model-agnostic meta-learning (MAML) framework introduced by Finn et al. (2017) to achieve improved performance by analyzing the temporal dynamics of the optimization procedure via the Runge-Kutta method. This method enables us to gain fine-grained control over the optimization and helps us achieve both the adaptation and representation goals across tasks. By leveraging this refined control, we demonstrate that there are multiple principled ways to update MAML and show that the classic MAML optimization is simply a special case of second-order Runge-Kutta method that mainly focuses on fast-adaptation. Experiments on benchmark classification, regression and reinforcement learning tasks show that this refined control helps attain improved results.
    Optimization, Meta learning, Runge-Kutta methods, Classification, Midpoint method, Regression, Reinforcement learning, Architecture, Hidden layer, Overfitting, ...
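    As a hedged sketch of the structural parallel claimed above (not the paper's derivation): viewing adaptation as the gradient flow $\dot\theta = -\nabla\mathcal{L}(\theta)$, an explicit second-order Runge-Kutta (midpoint) step reads $\theta_{n+1} = \theta_n + h\,f\big(\theta_n + \tfrac{h}{2}\,f(\theta_n)\big)$ with $f(\theta) = -\nabla\mathcal{L}(\theta)$, i.e. the derivative is evaluated after a preliminary explicit step. MAML has the same structure: the inner step $\theta_i' = \theta - \alpha\,\nabla\mathcal{L}_i(\theta)$ is followed by the outer update $\theta \leftarrow \theta - \beta\,\nabla_\theta \sum_i \mathcal{L}_i(\theta_i')$, whose gradient is evaluated at the point reached by that preliminary step.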
  • In this paper we present an analysis of the geological, meteorological and climatic data recorded in Pieve a Nievole (PT) over 24 years, and use these data for the establishment of the research structure called "PAN.R.C. - Pieve a Nievole Research Center" and "NGICS - New study of geological and Italian city". These data are compared to check local variations, long-term trends and correlations with mean annual temperature. The ultimate goal of this work is to understand long-term climatic changes in this geographic area. The analysis is performed using a statistical approach, and particular care is taken to minimize any effect due to biases in cases of missing data.
    Graph, Perturbation theory, Earthquake, Climate, Humidity, Dew point, Atmospheric pressure, Instability, Attention, Superposition, ...
  • I describe a heuristic model where MOND dynamics emerge in a universe viewed as a nearly spherical brane embedded in a higher-dimensional flat space. The brane, described by $\xi(\Omega)$, is of density $\sigma$ ($\xi$ and $\Omega$ are the radial and angular coordinates in the embedding space). The brane and matter -- confined to the brane and of density $\rho(\Omega)\ll\sigma$ -- are coupled to a potential $\varepsilon(\xi)$. I restrict myself to shallow perturbations, $\xi(\Omega)=\ell_0+\zeta(\Omega)$, $|\zeta|\ll\ell_0$. A balanced brane implies $\hat a_0\equiv\varepsilon'(\ell_0)\sim T/\sigma\ell_0$, $T$ is the brane tension, yielding for the velocity of small brane perturbations $c^2\sim T/\sigma\sim \ell_0\hat a_0$. But, $\hat a_0$ plays the role of the MOND acceleration constant in local gravitational dynamics; so $\hat a_0\sim c^2/\ell_0$. What we, in the brane, perceive as the gravitational potential is $\phi\equiv\varepsilon[\xi(\Omega)]\approx \phi_0+\hat a_0\zeta$. Aspects of MOND that may emerge naturally as geometrical properties are: a. The special role of acceleration in MOND, and why it is an acceleration, $a_0$, that marks the transition from the standard dynamics much above $a_0$ to scale-invariant dynamics much below $a_0$. b. The intriguing connection of $a_0$ with cosmology. c. The Newtonian limit corresponds to local departure $|\zeta|\ll\ell_0$; i.e., $\phi-\phi_0\sim a_0\zeta\ll a_0\ell_0\sim c^2$ - whereas relativity enters when $|\zeta|\not\ll\ell_0$. The model also opens new vistas for extension, e.g., it points to possible dependence of $a_0$ on $\phi$, and to $a_0$ losing its status and meaning altogether in the relativistic regime. The required global balance of the brane might solve the `old' cosmological-constant problem. I discuss possible connections with the nearly-de-Sitter nature of our Universe. (Abridged.)
    Modified Newtonian Dynamics, deep-MOND limit, Scale invariance, Braneworld, Embedding, Cosmological constant, General relativity, Degree of freedom, Cosmology, De Sitter space, ...
  • By means of interferometic 21-cm observations and a 3D kinematic modeling technique, we study the gas kinematics of six HI-rich ultra-diffuse galaxies (UDGs). We derive robust circular velocities and baryonic masses, that allow us to study the position of our UDGs with respect to the baryonic Tully-Fisher relation (BTFR). Somewhat surprisingly, we find that these galaxies are strong outliers from the BTFR, rotating too slowly for their baryonic mass. Moreover, their position in the circular velocity--baryonic mass plane implies that they have a baryon fraction inside their virial radii compatible with the cosmological mean, meaning that they have no "missing baryons". Unexpectedly, the dynamics of our galaxies are dominated by the baryons, leaving small room for dark matter inside their discs.
    Ultra-diffuse galaxy-like object, Galaxy, Milky Way, Baryonic Tully-Fisher relation, Circular velocity, Kinematics, Inclination, Dark matter, Rotation Curve, Velocity dispersion, ...
  • We investigate the reliability of standard N-body simulations by modelling the well-known Hernquist halo with the \texttt{GADGET-2} code (which uses the tree algorithm to calculate the gravitational force) and the \texttt{ph4} code (which uses direct summation). Comparing the results, we find that the core formation in the halo center (which is conventionally considered as the first sign of numerical effects, to be specific, of collisional relaxation) has nothing to do with collisional relaxation, being defined by the properties of the tree algorithm. This result casts doubts on the universally adopted criteria of simulation reliability in the halo center. Though we use a halo model which is theoretically proven to be stationary and stable, a sort of numerical 'violent relaxation' occurs. Its properties suggest that this effect is highly likely responsible for the central cusp formation in cosmological modelling of the large-scale structure, and then the 'core-cusp problem' is no more than a technical problem of N-body simulations.
    Relaxation, N-body simulation, Instability, Violent relaxation, Dark matter halo, Shot noise, Simulations of structure formation, Dark matter, Core-Cusp problem, Planck mission, ...
  • We show that a thermal relic which decouples from the standard model (SM) plasma while relativistic can be a viable dark matter (DM) candidate, if the decoupling is followed by a period of entropy dilution that heats up the SM, but not the dark sector. Such diluted hot relics can be as light as a keV, while accounting for the entirety of the DM, and not conflicting with cosmological and astrophysical measurements. The requisite dilution can be achieved via decays of a heavy state that dominates the energy budget of the universe in the early matter-dominated era. The heavy state decays into the SM particles, heats up the SM plasma, and dilutes the hidden sector. The interaction required to equilibrate the two sectors in the early universe places a bound on the maximum possible dilution as a function of the decoupling temperature. As an example of diluted hot relic DM we consider a light Dirac fermion with a heavy dark photon mediator. We present constraints on the model from terrestrial experiments (current and future), astrophysics, and cosmology.
    Dark matter, Standard Model, Light dark matter, Hidden sector, Entropy dilution, Hidden photon, The early Universe, Freeze-out, Dark sector, Dirac fermion, ...
  • MOND is a paradigm that contends to account for the mass discrepancies in the Universe without invoking `dark' components, such as `dark matter' and `dark energy'. It does so by supplanting Newtonian dynamics and General Relativity, departing from them at very low accelerations. Having in mind historians and philosophers of science, as well as physicists and astronomers, I describe in this review the main aspects of MOND -- its statement, its basic tenets, its main predictions, and the tests of these predictions -- contrasting it with the dark-matter paradigm. I then discuss possible wider ramifications of MOND, for example the potential significance of the MOND constant, $a_0$, and its possible implications for the roots of MOND in cosmology. Along the way I point to parallels with several historical instances of nascent paradigms. In particular, with the emergence of the Copernican world picture, that of quantum physics, and that of relativity. I point to analogies between these paradigms as regards their initial advent, their development, their schematic structure, and their ramifications. For example, the interplay between theories and their corollary laws, and the centrality of a new constant with converging values as deduced from seemingly unrelated manifestations of these laws. I parallel the struggle of the new with the old paradigms, and the appearance of hybrid paradigms at such times of struggle. I also try to identify in the history of those established paradigms a stage that can be likened to that of MOND today.
    Modified Newtonian Dynamics, Dark matter, Galaxy, Rotation Curve, General relativity, Milky Way, Disk galaxy, Cosmology, Dark energy, MOND phenomenology, ...
  • The evidence is that the mass of the universe is dominated by an exotic nonbaryonic form of matter largely draped around the galaxies. It approximates an initially low pressure gas of particles that interact only with gravity, but we know little more than that. Searches for detection thus must follow many difficult paths to a great discovery, what the universe is made of. The nonbaryonic picture grew out of a convergence of evidence and ideas in the early 1980s. Developments two decades later considerably improved the evidence, and advances since then have made the case for nonbaryonic dark matter compelling.
    Dark matter, Galaxy, Cold dark matter, Milky Way, Cosmology, Cosmic microwave background, Inflation, Cluster of galaxies, CMB temperature anisotropy, Curvature, ...
  • The 21-cm signal of neutral hydrogen is a sensitive probe of the Epoch of Reionization, Cosmic Dawn and the Dark Ages. Currently operating radio telescopes have ushered in a data-driven era of 21-cm cosmology, providing the first constraints on the astrophysical properties of sources that drive this signal. However, extracting astrophysical information from the data is highly non-trivial and requires the rapid generation of theoretical templates over a wide range of astrophysical parameters. To this end emulators are often employed, with previous efforts focused on predicting the power spectrum. In this work we introduce 21cmGEM -- the first emulator of the global 21-cm signal from Cosmic Dawn and the Epoch of Reionization. The smoothness of the output signal is guaranteed by design. We train neural networks to predict the cosmological signal based on a seven-parameter astrophysical model, using a database of $\sim$30,000 simulated signals. We test the performance with a set of $\sim$2,000 simulated signals, showing that the relative error in the prediction has an r.m.s. of 0.0159. The algorithm is efficient, with a running time per parameter set of 0.16 sec. Finally, we use the database of models to check the robustness of relations between the features of the global signal and the astrophysical parameters that we previously reported. In particular, we confirm the prediction that the coordinates of the maxima of the global signal, if measured, can be used to estimate the Ly{\alpha} intensity and the X-ray intensity at early cosmic times.
    Hydrogen 21 cm line, Epoch of reionization, Reionization, Cosmic Dawn, Principal component analysis, Star formation, Intensity, Spectral energy distribution, Star formation efficiency, Ionizing radiation, ...
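    As a generic illustration of the emulation setup described above (the architecture, training data and signal length here are stand-ins, not the 21cmGEM design), a multi-output regressor mapping 7 astrophysical parameters to a discretized global signal could be set up like this:

      import numpy as np
      from sklearn.neural_network import MLPRegressor

      n_train, n_params, n_freq = 3000, 7, 451        # stand-in sizes
      theta = np.random.rand(n_train, n_params)       # stand-in for simulated parameter sets
      signals = np.random.randn(n_train, n_freq)      # stand-in for simulated global signals

      emulator = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=50)
      emulator.fit(theta, signals)                    # multi-output regression
      prediction = emulator.predict(theta[:1])        # (1, n_freq) emulated signal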
  • Recent theoretical work indicates that the neutrino radiation in core-collapse supernovae may be susceptible to flavor instabilities that set in far behind the shock, grow extremely rapidly, and have the potential to profoundly affect supernova dynamics and composition. Here we analyze the nonlinear collective oscillations that are prefigured by these instabilities. We demonstrate that a zero-crossing in $n_{\nu_e} - n_{\bar{\nu}_e}$ as a function of propagation angle is not sufficient to generate instability. Our analysis accounts for this fact and allows us to formulate complementary criteria. Using Fornax simulation data, we show that fast collective oscillations qualitatively depend on how forward-peaked the neutrino angular distributions are.
    Instability, Neutrino, Electron lepton number, Supernova, Core-collapse supernova, Neutrino oscillations, Supernova dynamics, Kinematics, Superposition, Neutrino flavor, ...
  • We review the features of Dark Matter as a particle, presenting some old and new instructive models, and looking for their physical implications in the early universe and in the process of structure formation. We also present a schematic of Dark Matter searches and introduce the most promising candidates to the role of Dark Matter particle.
    Dark matter, Weakly interacting massive particle, Dark matter particle, Neutrino, Axion, Standard Model, Sterile neutrino, Cosmic ray, Freeze-out, Structure formation, ...
  • These notes are based on lectures given by Michael Green during Part III of the Mathematics Tripos (the Certificate for Advanced Study in Mathematics) in the Spring of 2003. The course provided an introduction to string theory, focussing on the Bosonic string, but treating the superstring as well. A background in quantum field theory and general relativity is assumed. Some background in particle physics, group theory and conformal field theory is useful, though not essential. A number of appendices on more advanced topics are also provided, including an introduction to orientifolds in various brane configurations which helps to populate a relatively sparse part of the literature.
    Open string theory, Degree of freedom, Supersymmetry, String theory, Excited state, Normal order, Bosonization, Superstring, Bosonic string theory, Light cones, ...
  • We calculate the Standard Model (SM) predictions for the differential branching ratio of the rare $B_s \to \phi\mu^+ \mu^-$ decays using $B_s \to \phi$ transition form factors (TFFs) obtained from holographic light-front QCD (hQCD) instead of the traditional QCD sum rules (QCDSR). Our predictions for the differential branching ratio are in better agreement with the LHCb data. Also, we find that the hQCD prediction for $R_{K^*\phi}$, the ratio of the branching fraction of $B \to K^* \mu^+ \mu^-$ to that of $B_s \to \phi\mu^+ \mu^-$, is in excellent agreement with both the LHCb and CDF results in the low $q^2$ range.
    Light front, Branching ratio, Transition form factor, Standard Model, Wavefunction, LHCb experiment, Light-cone sum rules, Form factor, Rare decay, Vector meson, ...
  • Recently, deviations of the ratios $R(D)$, $R(D^{*})$ and $R(J/\psi)$ have been found between experimental data and the Standard Model predictions, which may be a hint of New Physics. In this work, we calculate these ratios within the Standard Model by using the improved instantaneous Bethe-Salpeter method. The emphasis is placed on the relativistic corrections to the form factors. The results are $R(D)=0.312 ^{+0.006}_{-0.007}$, $R(D^*)= 0.249^{+0.001}_{-0.002}$, $R(D_s)=0.320 ^{+0.009}_{-0.009}$, $R(D^*_s)=0.251 ^{+0.002}_{-0.003}$, $R(\eta_c)=0.384 ^{+0.032}_{-0.042}$, and $R(J/\psi)=0.267 ^{+0.009}_{-0.011}$, which are consistent with the predictions of other models and with the experimental data. The semileptonic decay rates and the corresponding form factors at zero recoil are also given.
    Form factor, Standard Model, Relativistic correction, Decay rate, Semileptonic decay, Experimental data, ...
  • The nature of the recently discovered $Z_c$ and $Z_b$ structures is intriguing. Their charge forces its minimal quark content to be $Q\bar Q q\bar q$ (where $Q=\{c,b\}$ and $q=\{u,d\}$). In this work we perform a molecular coupled-channels calculation of the $I^G(J^{PC})=1^+(1^{+-})$ charm and bottom sectors in the framework of a constituent quark model which satisfactorily describes a wide range of properties of (non-)conventional hadrons containing heavy quarks. All the relevant channels are included for each sector, i.e.: The $D^{(\ast)}\bar D^{\ast}+h.c.$, $\pi J/\psi$ and $\rho\eta_c$ channels for the $Z_c$ and $B^{(\ast)}B^{\ast}$ and $\Upsilon(nS)\pi$ ($n=1,2,3$) channels for the $Z_b$ analysis. Possible structures of these resonances will be discussed.
    Constituent quark, Quantum chromodynamics, Heavy quark, Quark mass, Bound state, Invariant mass, Confinement, Kinematics, Light quark, Reduced mass, ...
  • Reconstruction of the $B^0 \to D^{*-} \tau^+ \nu_{\tau}$ angular distribution is complicated by the strongly-biasing effect of losing the neutrino information from both the $B$ and $\tau$ decays. In this work, a novel method for making unbiased measurements of the angular coefficients while preserving the model independence of the angular technique is demonstrated. The twelve angular functions that describe the signal decay, in addition to background terms, are modelled in a multidimensional fit, using template probability density functions that encapsulate all resolution and acceptance effects. Sensitivities at the LHCb and Belle II experiments are estimated, and sources of systematic uncertainty are discussed, notably in the extrapolation to a measurement of $R(D^{*})$.
    LHCb experiment, Standard Model, Neutrino, Statistical error, Systematic error, Form factor, BELLE II, Kinematics, Invariant mass, Decay vertex, ...
  • A few massive ($M_{*} > 10^8 M_{\odot}$), high-redshift ($z = 8-10$) galaxies have recently been discovered to contain stars with ages of several hundred million years, pushing the onset of star formation in these galaxies back to $z\sim15$. The very existence of stars formed so early may serve as a test for cosmological models with little small scale power (and, hence, late formation of cosmic structure). We explore the ages of oldest stars in numerical simulations from the Cosmic Reionization On Computers (CROC) project with Cold Dark Matter (CDM) and two Warm Dark Matter (WDM) cosmologies with 3 keV and 6 keV particles. There are statistically significant differences between average stellar ages of massive galaxies in CDM and 3 keV WDM. However, these differences are much smaller than both the quoted uncertainties in observational data on the ages of galaxies at these redshifts, and the systematic uncertainties in simulation predictions of these ages as assessed by a convergence test. Further theoretical progress will be needed to refine simulation predictions to an accuracy that would enable dark matter particle physics constraints from this probe.
    Warm dark matter, Cosmic Reionization On Computers, Cold dark matter, Galaxy, Cosmology, Massive galaxies, Star formation, Of stars, Star formation histories, Star, ...
  • We obtain a system of identities relating boundary coefficients and spectral data for the one-dimensional Schr\"{o}dinger equation with boundary conditions containing rational Herglotz--Nevanlinna functions of the eigenvalue parameter. These identities can be thought of as a kind of mini version of the Gelfand--Levitan integral equation for boundary coefficients only.
    Boundary value problem, Inverse problems, Parseval's identity, Eigenfunction, Meromorphic function, Quantum computation, Fluid dynamics, String theory, Self-adjoint operator, Eigenvalue, ...
  • The purpose of this note is to use the results and methods of our previous work with Bourgain to obtain control and observability by rough functions and sets on rectangular 2-tori. We show that any Lebesgue measurable set of positive measure can be used for observability for the Schroedinger equation. This leads to controllability with rough localization and control functions. For non-empty open sets this follows from the results of Haraux '89 and Jaffard '89 while for square tori and sufficiently long times this can be deduced from the results of Jakobson '97.
    Torus, Wave equation, Dilute magnetic semiconductors, Hille-Yosida theorem, Eigenfunction, Closed graph theorem, Manifold, Radon-Nikodym theorem, Ergodicity, Self-adjoint operator, ...
  • This paper studies decay rates for the energy of solutions of the damped wave equation on the torus. It considers dampings invariant in one direction and equal to a sum of squares of nonnegative functions with some number of derivatives. If such a damping vanishes only on a small enough strip then the energy decays at rate $1/t^{3/4}$. The proof uses a positive commutator argument and relies on a pseudodifferential calculus for low regularity symbols.
    Decay rate, Wave equation, Nonnegative, Torus, Parametrix, Geodesic, Phase space, Manifold, Attention, Bounded operator, ...
  • We discuss parabolic versions of Euler's identity e^{it}=cos t + i sin t. A purely algebraic approach based on dual numbers is known to produce a very trivial relation e^{pt} = 1+pt. Therefore we use a geometric setup of parabolic rotations to recover the corresponding non-trivial algebraic framework. Our main tool is Moebius transformations which turn out to be closely related to induced representations of the group SL(2,R). Keywords: complex numbers, dual numbers, double numbers, linear algebra, invariant, computer algebra, GiNaC
    Induced representation, Euler's identity, Complex number, Keyphrase, Möbius transformation, Imaginary number, Ellipticity, Algebra, Transformations, Linear algebra, ...
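    The "trivial" dual-number relation $e^{pt} = 1 + pt$ mentioned above can be checked with a few lines of Python: an illustrative dual-number class with nilpotent unit $p$ ($p^2 = 0$), so the exponential series truncates after the linear term.

      class Dual:
          """Numbers a + b*p with p^2 = 0."""
          def __init__(self, a, b):
              self.a, self.b = a, b
          def __add__(self, other):
              return Dual(self.a + other.a, self.b + other.b)
          def __mul__(self, other):
              return Dual(self.a * other.a, self.a * other.b + self.b * other.a)
          def __repr__(self):
              return f"{self.a} + {self.b}p"

      def exp(x, terms=10):
          # Truncated exponential series sum_{n} x^n / n!
          result, power, fact = Dual(1, 0), Dual(1, 0), 1
          for n in range(1, terms):
              power = power * x
              fact *= n
              result = result + Dual(power.a / fact, power.b / fact)
          return result

      t = 2.5
      print(exp(Dual(0, t)))   # -> 1.0 + 2.5p, i.e. e^{pt} = 1 + pt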
  • Following the coalescence of binary neutron stars, debris from the merger which remains marginally bound to the central compact remnant will fall back at late times, feeding a sustained accretion flow. Unbound winds or a wide-angle jet from this radiatively-inefficient disk may collide with the comparatively slow, dense kilonova ejecta released during an earlier phase. Under the assumption that such interactions accelerate cosmic rays to ultra-high energies, we numerically simulate their propagation and interactions through the dynamical ejecta. The hadronuclear and photo-hadronic processes experienced by the particles produce isotropic high-energy neutrino fluxes, peaking at times $10^{3-4}\,$s, which we calculate for two sets of parameters. A first set is inspired by the observations of GW170817. In the second scenario, which we call optimistic, parameters are chosen so as to optimize the neutrino flux, within the range allowed by observation and theory. We find that single sources can only be detected with IceCube-Gen2 for optimistic scenarios and if located within $\sim 4\,$Mpc. The cumulative flux could contribute $\sim 0.5-10\%$ of the diffuse flux observed by the IceCube Observatory, depending on the fall-back power and the cosmic-ray composition. The neutrino emission powered by fallback is nearly isotropic, and can be used for future correlation studies with gravitational wave signals.
    Ejecta, Neutrino, Neutrino flux, IceCube Neutrino Observatory, Cosmic ray, GW170817, Luminosity, Neutron star, Binary neutron star, Pion, ...
  • The models based on $SU(3)_C\times SU(3)_L\times U(1)_X$ gauge symmetry (331-models) have been advocated to explain the number of fermion families. These models place one quark family in a different representation than the other two. The traditional 331-models are plagued by scalar-mediated quark flavour changing neutral currents (FCNC) at tree level. So far there have been no concrete mechanisms to suppress these FCNCs in 331-models. Recently it has been shown that the Froggatt-Nielsen mechanism can be incorporated into the 331-setting in an economical fashion (FN331-model). The FN331-model explains both the number of fermion families in nature and their mass hierarchy simultaneously. In this work we study the Higgs-mediated quark FCNCs in the FN331-model. The flavour violating couplings of quarks are suppressed by the ratio of the $SU(2)_L \times U(1)_Y$ and $SU(3)_L\times U(1)_X$ breaking scales. We find that the $SU(3)_L\times U(1)_X$-breaking scale can be as low as 5 TeV and still pass the flavour bounds.
    Yukawa coupling, Froggatt-Nielsen mechanism, Flavour Changing Neutral Currents, Flavour, Standard Model, Higgs boson, Quark mass, Cabibbo-Kobayashi-Maskawa matrix, Charged lepton, Neutrino mass hierarchy, ...
  • Ultra-light scalar fields, such as the axion and more generally axion-like particles, can arise ubiquitously, for instance as a result of the spontaneous breaking of an approximate symmetry. In addition to the particle physics motivations, these particles can also play a major role in cosmology by contributing to the dark matter abundance and affecting structure formation at sub-Mpc scales. In this paper, we propose to use 21cm forest observations to probe the nature of ultra-light dark matter. The 21cm forest can probe much smaller scales than the Lyman-$\alpha$ forest, that is, $k\gtrsim 10\mathrm{Mpc}^{-1}$. We explore the range of the ultra-light dark matter mass $m_{u}$ and $f_u$, the fraction of ultra-light dark matter with respect to the total matter, which can be probed by the 21cm forest. We find that the 21cm forest can potentially put a lower bound on the dark matter mass of $m_u \gtrsim 10^{-18}$ eV for $f_u=1$, which is a mass scale 3 orders of magnitude larger than that probed by current Lyman-$\alpha$ forest observations. While the effects of the ultra-light particles on structure formation become smaller when the dominant component of dark matter is composed of conventional cold dark matter, we find that the 21cm forest is still powerful enough to probe the sub-component ultra-light dark matter mass up to the order of $10^{-19}$ eV. A Fisher matrix analysis shows that $(m_u,f_u)\sim (10^{-20}\mathrm{eV}, 0.3)$ is the optimal parameter set that the 21cm forest can probe with minimal errors for a sub-component ultra-light dark matter scenario.
    21cm forest, Ultra-light dark matter, Absorption line, Abundance, Ultracompact minihalo, Dark matter, Lyman-alpha forest, Matter power spectrum, Light scalar, Dark matter particle mass, ...
  • The Fermilab Short-Baseline Neutrino (SBN) experiments, MicroBooNE, ICARUS, and SBND, are expected to have significant sensitivity to light weakly coupled hidden sector particles. Here we study the capability of the SBN experiments to probe dark scalars interacting through the Higgs portal. We investigate production of dark scalars using both the Fermilab Booster 8 GeV and NuMI 120 GeV proton beams, simulating kaons decaying to dark scalars and taking into account the beamline geometry. We also investigate strategies to mitigate backgrounds from beam-related neutrino scattering events. We find that SBND, with its comparatively short ${\cal O}(100\ {\rm m})$ baseline, will have the best sensitivity to scalars produced with Booster, while ICARUS, with its large detector volume, will provide the best limits on off-axis dark scalar production from NuMI. The SBN experiments can provide leading tests of dark scalars with masses in the 50 - 350 MeV range in the near term. Our results motivate dedicated experimental searches for dark scalars and other long-lived hidden sector states at these experiments.
    Neutrino, NuMI, Dark Higgs boson, Muon, Fermilab, Higgs portal, Beamline, Kaon, Kaon decay, Pion, ...
  • Injection of high energy electromagnetic particles around the recombination epoch can modify the standard recombination history and therefore the CMB anisotropy power spectrum. Previous studies have put strong constraints on the amount of electromagnetic energy injection around the recombination era (redshifts $z\lesssim 4500$). However, energy injected in the form of energetic ($>$ keV) visible standard model particles is not deposited instantaneously. The considerable delay between the time of energy injection and the time when all energy is deposited to background baryonic gas and CMB photons, together with the extraordinary precision with which the CMB anisotropies have been measured, means that CMB anisotropies are sensitive to energy that was injected much before the epoch of recombination. We show that the CMB anisotropy power spectrum is sensitive to energy injection even at $z = 10000$, giving stronger constraints compared to big bang nucleosynthesis and CMB spectral distortions. We derive, using Planck CMB data, the constraints on long-lived unstable particles decaying at redshifts $z\lesssim 10000$ (lifetime $\tau_X\gtrsim 10^{11}$s) by explicitly evolving the electromagnetic cascades in the expanding Universe, thus extending previous constraints to lower particle lifetimes. We also revisit the BBN constraints and show that the delayed injection of energy is important for BBN constraints. We find that the constraints can be weaker by a factor of few to almost an order of magnitude, depending on the energy, when we relax the quasi-static or on-the-spot assumptions.
    Recombination, CMB temperature anisotropy, Ionization, Big bang nucleosynthesis, Positron, CMB spectral distortions, Dark matter, Dark matter decay, Electromagnetic cascade, Planck mission, ...
  • We study the impact of relativistic effects in the 3-dimensional cross-correlation between Lyman-$\alpha$ forest and quasars. Apart from the relativistic effects, which are dominated by the Doppler contribution, several systematic effects are also included in our analysis (intervening metals, unidentified high column density systems, transverse proximity effect and effect of the UV fluctuations). We compute the signal-to-noise ratio for the Baryonic Oscillation Spectroscopic Survey (BOSS), the extended Baryonic Oscillation Spectroscopic Survey (eBOSS) and the Dark Energy Spectroscopic Instrument (DESI) surveys, showing that DESI will be able to detect the Doppler contribution in a Large Scale Structure (LSS) survey for the first time, with a S/N $>7$ for $r_{\rm min} > 10$ Mpc$/h$, where r$_{\rm min}$ denotes the minimum comoving separation between sources. We demonstrate that several physical parameters, introduced to provide a full modelling of the cross-correlation function, are affected by the Doppler contribution. By using a Fisher matrix approach, we establish that if the Doppler contribution is neglected in the data analysis, the derived parameters will be shifted by a non-negligible amount for the upcoming surveys.
    Quasar, Cross-correlation, Dark Energy Spectroscopic Instrument, Baryon Oscillation Spectroscopic Survey, Two-point correlation function, Signal to noise ratio, eBOSS survey, Covariance, Redshift-space distortion, Doppler effect, ...
  • This is an idiosyncratic account of the main results presented at the 31st Rencontres de Blois, which took place from June 2nd to June 7th, 2019 in the Castle of Blois, France.
    Dark matter, Standard Model, Higgs boson, Large Hadron Collider, Cosmology, Black hole, BSM physics, Baryon acoustic oscillations, Collider, Precision measurement, ...
  • The experimental results on the ratios of branching fractions $\mathcal{R}(D) = {\cal B}(\bar{B} \to D \tau^- \bar{\nu}_{\tau})/{\cal B}(\bar{B} \to D \ell^- \bar{\nu}_{\ell})$ and $\mathcal{R}(D^*) = {\cal B}(\bar{B} \to D^* \tau^- \bar{\nu}_{\tau})/{\cal B}(\bar{B} \to D^* \ell^- \bar{\nu}_{\ell})$, where $\ell$ denotes an electron or a muon, show a long-standing discrepancy with the Standard Model predictions, and might hint at a violation of lepton flavor universality. We report a new simultaneous measurement of $\mathcal{R}(D)$ and $\mathcal{R}(D^*)$, based on a data sample containing $772 \times 10^6$ $B\bar{B}$ events recorded at the $\Upsilon(4S)$ resonance with the Belle detector at the KEKB $e^+ e^-$ collider. In this analysis the tag-side $B$ meson is reconstructed in a semileptonic decay mode and the signal-side $\tau$ is reconstructed in a purely leptonic decay. The measured values are $\mathcal{R}(D)= 0.307 \pm 0.037 \pm 0.016$ and $\mathcal{R}(D^*) = 0.283 \pm 0.018 \pm 0.014$, where the first uncertainties are statistical and the second are systematic. These results are in agreement with the Standard Model predictions within $0.2$, $1.1$ and $0.8$ standard deviations for $\mathcal{R}(D)$, $\mathcal{R}(D^*)$ and their combination, respectively. This work constitutes the most precise measurements of $\mathcal{R}(D)$ and $\mathcal{R}(D^*)$ performed to date, as well as the first result for $\mathcal{R}(D)$ based on a semileptonic tagging method.
    Monte Carlo method, Standard Model, Branching ratio, Systematic error, Data sampling, Muon, Decay mode, KEKB, Semileptonic decay, Pion, ...
  • Recent experimental results for ${\cal R}(D^{(*)})$ deviate from the standard model (SM) by $3.1\sigma$, suggesting new physics (NP) that affects the $b\to c \tau \bar\nu_\tau$ transition. Motivated by this, we investigate possible NP effects in the $\Lambda_b\to\Lambda_c \tau\bar\nu_\tau$ decay. For this purpose, assuming the neutrinos are left-handed, we calculate in detail the helicity amplitudes of $\Lambda_b\to\Lambda_c \ell\bar\nu_\ell$ decays with all possible four-fermion operators. Using the latest results for the $\Lambda_b\to\Lambda_c$ form factors from lattice QCD calculations, we study these decays in a model-independent manner. The differential and total branching fractions and other observables are calculated. In the SM, we obtain the ratio ${\cal R}(\Lambda_c)=0.33\pm0.01$. Supposing that NP only affects the third-generation fermions, we present the correlations among ${\cal R}(D)$, ${\cal R}(D^*)$ and ${\cal R}(\Lambda_c)$. We perform a minimum-$\chi^2$ fit of the Wilson coefficient of each operator to the latest experimental data for different observables. It is found that the left-handed scalar operator ${\cal O}_{SL}$ affects the branching fraction remarkably, and the ratio ${\cal R}(\Lambda_c)$ can be enhanced by $30\%$. For the other operators, the ratio amounts to $0.38\pm0.02$, which is larger than the SM prediction by $20\%$. Using the fitted values of the Wilson coefficients of the single NP operators, we also give predictions for the physical observables of $\Lambda_b\to\Lambda_c \tau\bar\nu_\tau$. Furthermore, we also study the effects of three typical NP models on $\Lambda_b\to\Lambda_c \tau\bar\nu_\tau$. We hope our results can be tested at the current LHCb experiment and at future high-energy experiments.
    Standard Model, Branching ratio, Wilson coefficients, Form factor, Helicity, Leptoquark, Forward-backward asymmetry, LHCb experiment, Lattice QCD, Lepton flavour universality, ...
  • We use lattice QCD to calculate the form factors $f_+(q^2)$ and $f_0(q^2)$ for the semileptonic decay $B_s\to K\ell\nu$. Our calculation uses six MILC asqtad 2+1 flavor gauge-field ensembles with three lattice spacings. At the smallest and largest lattice spacing the light-quark sea mass is set to 1/10 the strange-quark mass. At the intermediate lattice spacing, we use four values for the light-quark sea mass ranging from 1/5 to 1/20 of the strange-quark mass. We use the asqtad improved staggered action for the light valence quarks, and the clover action with the Fermilab interpolation for the heavy valence bottom quark. We use SU(2) hard-kaon heavy-meson rooted staggered chiral perturbation theory to take the chiral-continuum limit. A functional $z$ expansion is used to extend the form factors to the full kinematic range. We present predictions for the differential decay rate for both $B_s\to K\mu\nu$ and $B_s\to K\tau\nu$. We also present results for the forward-backward asymmetry, the lepton polarization asymmetry, ratios of the scalar and vector form factors for the decays $B_s\to K\ell\nu$ and $B_s\to D_s \ell\nu$. Our results, together with future experimental measurements, can be used to determine the magnitude of the Cabibbo-Kobayashi-Maskawa matrix element $|V_{ub}|$.
    Form factor, Kaon, Two-point correlation function, Quark mass, Decay rate, Lattice QCD, Perturbation theory, Valence quark, Heavy quark, Light quark, ...
  • In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
    Regularization, Training set, Statistics, Architecture, Image Processing, Security, Classification, Complementarity, High Performance Computing, Classification systems, ...
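    The 16-layer configuration evaluated above is available off the shelf in torchvision; a minimal usage sketch (API details may vary slightly across torchvision versions):

      import torch
      from torchvision import models

      vgg16 = models.vgg16()                               # VGG-16 architecture, randomly initialized
      x = torch.randn(1, 3, 224, 224)                      # one 224x224 RGB image
      logits = vgg16(x)                                    # (1, 1000) ImageNet class scores
      print(sum(p.numel() for p in vgg16.parameters()))    # roughly 138M parameters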
  • The recent introduction of learned indexes has shaken the foundations of the decades-old field of indexing data structures. Combining, or even replacing, classic design elements such as B-tree nodes with machine learning models has proven to give outstanding improvements in the space footprint and time efficiency of data systems. However, these novel approaches are based on heuristics, thus they lack any guarantees on both their time and space requirements. We propose the Piecewise Geometric Model index (shortly, PGM-index), which achieves guaranteed I/O-optimality in query operations, learns an optimal number of linear models, and whose peculiar recursive construction makes it a purely learned data structure, rather than a hybrid of traditional and learned indexes (such as RMI and FITing-tree). We show that the PGM-index improves the space of the FITing-tree by 63.3% and of the B-tree by more than four orders of magnitude, while achieving the same or even better query time efficiency. We complement this result by proposing three variants of the PGM-index. First, we design a compressed PGM-index that further reduces its space footprint by exploiting the repetitiveness at the level of the learned linear models it is composed of. Second, we design a PGM-index that adapts itself to the distribution of the queries, thus resulting in the first known distribution-aware learned index to date. Finally, given its flexibility in the offered space-time trade-offs, we propose the multicriteria PGM-index that efficiently auto-tunes itself in a few seconds, over hundreds of millions of keys, to the possibly evolving space-time constraints imposed by the application in use. We remark to the reader that this paper is an extended and improved version of our previous paper titled "Superseding traditional indexes by orchestrating learning and geometry" (arXiv:1903.00507).
    Data structures, Multidimensional Array, Caching, Rank, Convex hull, Machine learning, Entropy, Internet of Things, Big data, Completeness, ...
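    The core learned-index idea described above (predict a position with a learned model, then correct it with a bounded local search) can be illustrated with a single-segment toy; the real PGM-index instead uses many error-bounded linear segments arranged recursively:

      import bisect

      def build(keys):
          # Least-squares fit of position ~ slope*key + intercept (keys sorted, distinct).
          n = len(keys)
          mean_k = sum(keys) / n
          mean_p = (n - 1) / 2
          cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
          var = sum((k - mean_k) ** 2 for k in keys)
          slope = cov / var
          return slope, mean_p - slope * mean_k

      def lookup(keys, model, key, epsilon=16):
          slope, intercept = model
          guess = int(slope * key + intercept)           # predicted position
          lo = max(0, guess - epsilon)                   # bounded local correction
          hi = min(len(keys), guess + epsilon + 1)
          i = bisect.bisect_left(keys, key, lo, hi)
          return i if i < len(keys) and keys[i] == key else -1

      keys = list(range(0, 2000, 2))                     # hypothetical sorted keys
      model = build(keys)
      print(lookup(keys, model, 1234))                   # -> 617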
  • The standard cosmological model successfully describes many observations from widely different epochs of the Universe, from primordial nucleosynthesis to the accelerating expansion of the present day. However, as the basic cosmological parameters of the model are being determined with increasing and unprecedented precision, it is not guaranteed that the same model will fit more precise observations from widely different cosmic epochs. Discrepancies developing between observations at early and late cosmological time may require an expansion of the standard model, and may lead to the discovery of new physics. The workshop "Tensions between the Early and the Late Universe" was held at the Kavli Institute for Theoretical Physics on July 15-17 2019 (More details of the workshop (including on-line presentations) are given at the website: https://www.kitp.ucsb.edu/activities/enervac-c19) to evaluate increasing evidence for these discrepancies, primarily in the value of the Hubble constant as well as ideas recently proposed to explain this tension. Multiple new observational results for the Hubble constant were presented in the time frame of the workshop using different probes: Cepheids, strong lensing time delays, tip of the red giant branch (TRGB), megamasers, Oxygen-rich Miras and surface brightness fluctuations (SBF) resulting in a set of six new ones in the last several months. Here we present the summary plot of the meeting that shows combining any three independent approaches to measure H$_0$ in the late universe yields tension with the early Universe values between 4.0$\sigma$ and 5.8$\sigma$. This shows that the discrepancy does not appear to be dependent on the use of any one method, team, or source. Theoretical ideas to explain the discrepancy focused on new physics in the decade of expansion preceding recombination as the most plausible. This is a brief summary of the workshop.
    Tip of the red giant branch, Cepheid, The early Universe, Mira, Cosmic microwave background, Supernova Type Ia, Baryon acoustic oscillations, Hubble constant, Cosmic distance ladder, Sound horizon, ...
  • We present Atacama Large Millimetre Array and Atacama Compact Array observations of the Sunyaev-Zel'dovich effect in the z = 2 galaxy cluster Cl J1449+0856, an X-ray-detected progenitor of typical massive clusters in the present-day Universe. While in a cleaned but otherwise untouched 92 GHz map of this cluster little to no negative signal is visible, careful subtraction of known sub-millimetre emitters in the uv plane reveals a decrement at 5$\sigma$ significance. The total signal is -190$\pm$36 $\mu$Jy, with a peak offset by 5"-9" ($\sim$50 kpc) from both the X-ray centroid and the still-forming brightest cluster galaxy. A comparison of the recovered uv-amplitude profile of the decrement with different pressure models allows us to derive total mass constraints consistent with the $\sim$6$\times$10$^{13}$ M$_{\odot}$ estimated from X-ray data. Moreover, we find no strong evidence for a deviation of the pressure profile with respect to local galaxy clusters, although a slight tension at small-to-intermediate spatial scales suggests a flattened central profile, opposite to what is seen in a cool core and possibly an AGN-related effect. This analysis of the lowest-mass single SZ detection so far illustrates the importance of interferometers when observing the SZ effect in high-redshift clusters, the cores of which cannot be considered quiescent, such that careful subtraction of galaxy emission is necessary.
    Cluster of galaxies, Galaxy, Intra-cluster medium, Point source, Multidimensional Array, Pressure profile, Signal to noise ratio, Active Galactic Nuclei, Sunyaev-Zel'dovich effect, Cool core galaxy cluster, ...