• #### Symbolic Music Genre Transfer with CycleGAN

Deep generative models such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) have recently been applied to style and domain transfer for images, and in the case of VAEs, music. GAN-based models employing several generators and some form of cycle consistency loss have been among the most successful for image domain transfer. In this paper we apply such a model to symbolic music and show the feasibility of our approach for music genre transfer. Evaluations using separate genre classifiers show that the style transfer works well. In order to improve the fidelity of the transformed music, we add additional discriminators that cause the generators to keep the structure of the original music mostly intact, while still achieving strong genre transfer. Visual and audible results further show the potential of our approach. To the best of our knowledge, this paper represents the first application of GANs to symbolic music domain transfer.
Convolutional neural network, Architecture, Autoencoder, Neural network, Generative model, Gaussian noise, Recurrent neural network, Deep learning, Software, Mutual information, ...
• #### What Sets the Slope of the Molecular Kennicutt-Schmidt Relation?

The surface densities of molecular gas, $\Sigma_{\rm H_2}$, and star formation rate (SFR), $\dot\Sigma_\star$, correlate almost linearly on kiloparsec scales in the observed star-forming (non-starburst) galaxies. We explore the origin of the linear slope of this correlation using a suite of isolated $L_\star$ galaxy simulations. We show that in simulations with efficient feedback, the slope of the $\dot\Sigma_\star$-$\Sigma_{\rm H_2}$ relation on kiloparsec scales is insensitive to the slope of the $\dot\rho_\star$-$\rho$ relation assumed at the resolution scale. We also find that the slope on kiloparsec scales depends on the criteria used to identify star-forming gas, with a linear slope arising in simulations that identify star-forming gas using a virial parameter threshold. This behavior can be understood using a simple theoretical model based on conservation of interstellar gas mass as the gas cycles between atomic, molecular, and star-forming states under the influence of feedback and dynamical processes. In particular, we show that the linear slope emerges when feedback efficiently regulates and stirs the evolution of dense, molecular gas. We show that the model also provides insights into the likely origin of the relation between the SFR and molecular gas in real galaxies on different scales.
Star formation, Of stars, Star formation rate, Interstellar medium, Star-forming region, Star, Galaxy, Star formation efficiency, Galaxy simulations, Velocity dispersion, ...
• #### A precision test of the nature of Dark Matter and a probe of the QCD phase transition (ver. 2)

If dark matter (DM) contains equal numbers of u, d, s quarks, the ratio of DM and ordinary matter densities is shown to follow from the Boltzmann distribution in the Quark Gluon Plasma. For sexaquark DM in the 1860-1880 MeV mass range (assuring sexaquark and nuclear stability) and quark masses and transition temperature from lattice QCD, the observed $\Omega_{\rm DM}/\Omega_b = 5.3$ is in the predicted range, with $\lesssim 15\%$ uncertainty. The prediction is insensitive to the current form of DM, which could be sexaquarks, strange quark matter nuggets, primordial black holes from their collapse, or a mixture of these.
Dark matter, Primordial black hole, Quark-gluon plasma, Abundance, QCD phase transition, Lattice QCD, Precision, Quark mass, Dibaryon, Hyperon, ...
• #### From charge- and spin-ordering to superconductivity in the organic charge-transfer solids (ver. 2)

We review recent progress in understanding the different spatial broken symmetries that occur in the normal states of the family of charge-transfer solids (CTS) that exhibit superconductivity (SC), and discuss how this knowledge gives insight to the mechanism of the unconventional SC in these systems. We show that a unified theory of the diverse broken symmetry states necessarily requires explicit incorporation of strong electron-electron interactions and lattice discreteness, and most importantly, the correct bandfilling of one-quarter. Uniquely in the quarter-filled band, there is a very strong tendency to form nearest neighbor spin-singlets, in both one and two dimensions. The tendency to spin-singlets, a quantum effect, drives a commensurate charge-order in the correlated quarter-filled band. This charge-ordered spin-singlet, which we label as a paired-electron crystal (PEC), is different from and competes with both the antiferromagnetic state and the Wigner crystal of single electrons. Further, unlike these classical broken symmetries, the PEC is characterized by a spin gap. The tendency to the PEC in two dimensions is enhanced by lattice frustration. Following this characterization of the spatial broken symmetries, we critically reexamine spin-fluctuation and resonating valence bond theories of frustration-driven SC within half-filled band Hubbard and Hubbard-Heisenberg Hamiltonians for the superconducting CTS. We develop a valence-bond theory of SC within which the superconducting state is reached by the destabilization of the PEC by additional pressure-induced lattice frustration that makes the spin-singlets mobile. Our proposed mechanism for SC is the same for CTS in which the proximate semiconducting state is antiferromagnetic instead of charge-ordered, with the only difference that SC in the former is generated via a fluctuating spin-singlet state as opposed to static PEC.
Antiferromagnetic, Charge ordering, Spin gap, Superconductor, Phase diagram, Nuclear magnetic resonance, Anions, Doping, Instability, Charge density wave, ...
• #### FRAGE: Frequency-Agnostic Word Representation

Continuous word representation (aka word embedding) is a basic building block in many neural network-based models used in natural language processing tasks. Although it is widely accepted that words with similar semantics should be close to each other in the embedding space, we find that word embeddings learned in several tasks are biased towards word frequency: the embeddings of high-frequency and low-frequency words lie in different subregions of the embedding space, and the embedding of a rare word and a popular word can be far from each other even if they are semantically similar. This makes learned word embeddings ineffective, especially for rare words, and consequently limits the performance of these neural network models. In this paper, we develop a neat, simple yet effective way to learn *FRequency-AGnostic word Embedding* (FRAGE) using adversarial training. We conducted comprehensive studies on ten datasets across four natural language processing tasks, including word similarity, language modeling, machine translation and text classification. Results show that with FRAGE, we achieve higher performance than the baselines in all tasks.
Embedding, Computational linguistics, Machine translation, Neural network, Word embedding, Text Classification, Word vectors, Network model, Deep learning, Classification, ...
• #### WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia (ver. 2)

We present WikiReading, a large-scale natural language understanding task and publicly-available dataset with 18 million instances. The task is to predict textual values from the structured knowledge base Wikidata by reading the text of the corresponding Wikipedia articles. The task contains a rich variety of challenging classification and extraction sub-tasks, making it well-suited for end-to-end models such as deep neural networks (DNNs). We compare various state-of-the-art DNN-based architectures for document classification, information extraction, and question answering. We find that models supporting a rich answer space, such as word or character sequences, perform best. Our best-performing model, a word-level sequence to sequence model with a mechanism to copy out-of-vocabulary words, obtains an accuracy of 71.8%.
Recurrent neural network, Classification, Embedding, Natural language, Knowledge base, Architecture, Document classification, Entropy, Text Classification, Deep Neural Networks, ...
• #### Revisiting Small Batch Training for Deep Neural Networks

Modern deep neural network training is typically based on mini-batch stochastic gradient optimization. While the use of large mini-batches increases the available computational parallelism, small batch training has been shown to provide improved generalization performance and allows a significantly smaller memory footprint, which might also be exploited to improve machine throughput. In this paper, we review common assumptions on learning rate scaling and training duration, as a basis for an experimental comparison of test performance for different mini-batch sizes. We adopt a learning rate that corresponds to a constant average weight update per gradient calculation (i.e., per unit cost of computation), and point out that this results in a variance of the weight updates that increases linearly with the mini-batch size $m$. The collected experimental results for the CIFAR-10, CIFAR-100 and ImageNet datasets show that increasing the mini-batch size progressively reduces the range of learning rates that provide stable convergence and acceptable test performance. On the other hand, small mini-batch sizes provide more up-to-date gradient calculations, which yields more stable and reliable training. The best performance has been consistently obtained for mini-batch sizes between $m = 2$ and $m = 32$, which contrasts with recent work advocating the use of mini-batch sizes in the thousands.
Optimization, Deep Neural Networks, Training set, Architecture, Scheduling, Stochastic approximation, Statistics, Deep learning, Network model, Covariance matrix, ...
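The learning-rate argument in the small-batch abstract above can be illustrated with a short derivation. This is only a sketch under assumptions that the listing does not state: plain SGD, per-example gradients $g_i$ that are independent with common mean $\mu$ and covariance $C$, and a base learning rate $\tilde\eta$. All symbols other than $m$ are introduced here for illustration.

```latex
\begin{align}
  % SGD step with learning rate \eta on a mini-batch of size m
  \theta_{k+1} &= \theta_k - \frac{\eta}{m} \sum_{i=1}^{m} g_i , \\
  % linear scaling \eta = \tilde\eta\, m keeps the update per gradient calculation fixed
  \eta = \tilde\eta\, m \;&\Rightarrow\; \Delta\theta = -\tilde\eta \sum_{i=1}^{m} g_i , \\
  % mean update grows with m, i.e. it is constant per gradient calculation
  \mathbb{E}[\Delta\theta] &= -\tilde\eta\, m\, \mu , \\
  % covariance of the update grows linearly with m under the independence assumption
  \operatorname{Cov}[\Delta\theta] &= \tilde\eta^{2}\, m\, C .
\end{align}
```

Under these assumptions the expected update per gradient calculation, $-\tilde\eta\,\mu$, does not depend on $m$, while the covariance of each weight update is proportional to $m$, which matches the linear growth described in the abstract.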
• #### MS MARCO: A Human Generated MAchine Reading COmprehension Dataset (ver. 2)

Deep learning, Classification, Crowdsourcing, Recurrent neural network, Natural language, Convolutional neural network, Computational linguistics, Information retrieval, Generative model, Machine learning, ...
• #### QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension

Current end-to-end machine reading and question answering (Q&A) models are primarily based on recurrent neural networks (RNNs) with attention. Despite their success, these models are often slow for both training and inference due to the sequential nature of RNNs. We propose a new Q&A architecture called QANet, which does not require recurrent networks: its encoder consists exclusively of convolution and self-attention, where convolution models local interactions and self-attention models global interactions. On the SQuAD dataset, our model is 3x to 13x faster in training and 4x to 9x faster in inference, while achieving equivalent accuracy to recurrent models. The speed-up gain allows us to train the model with much more data. We hence combine our model with data generated by backtranslation from a neural machine translation model. On the SQuAD dataset, our single model, trained with augmented data, achieves 84.6 F1 score on the test set, which is significantly better than the best published F1 score of 81.8.
Recurrent neural network, Inference, Embedding, F1 score, Architecture, Machine translation, Computational linguistics, Long short term memory, Text Classification, Word embedding, ...
• #### Context is Everything: Finding Meaning Statistically in Semantic Spaces (ver. 5)

This paper introduces Contextual Salience (CoSal), a simple and explicit measure of a word's importance in context which is a more theoretically natural, practically simpler, and more accurate replacement for tf-idf. CoSal supports very small contexts (20 or more sentences) and out-of-context words, and is easy to calculate. A word vector space generated with both bigram phrases and unigram tokens reveals that contextually significant words disproportionately define phrases. This relationship is applied to produce simple weighted bag-of-words sentence embeddings. This model outperforms SkipThought and the best models trained on unordered sentences in most tests in Facebook's SentEval, beats tf-idf on all available tests, and is generally comparable to the state of the art. This paper also applies CoSal to sentence and document summarization and to an improved, context-aware cosine distance. Applying the premise that unexpected words are important, CoSal is presented as a replacement for tf-idf and an intuitive measure of contextual word importance.
Word vectors, Covariance, Vector space, Embedding, Bag of words model, Facebook, Inductive transfer, Long short term memory, Principal component, Sentiment analysis, ...
• #### Universal Sentence Encoder (ver. 2)

We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the encoding models allow for trade-offs between accuracy and compute resources. For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance. Comparisons are made with baselines that use word level transfer learning via pretrained word embeddings as well as baselines that do not use any transfer learning. We find that transfer learning using sentence embeddings tends to outperform word level transfer. With transfer learning via sentence embeddings, we observe surprisingly good performance with minimal amounts of supervised training data for a transfer task. We obtain encouraging results on Word Embedding Association Tests (WEAT) targeted at detecting model bias. Our pre-trained sentence encoding models are made freely available for download and on TF Hub.
Inductive transfer, Embedding, Computational linguistics, Architecture, Classification, Word embedding, Convolutional neural network, Deep Neural Networks, Graph, Training set, ...
• #### Universal Transformers

Recurrent neural network, Machine translation, Architecture, Long short term memory, Inductive bias, Natural language, Hidden state, Ranking, Convolutional neural network, Inference, ...
• #### A Decomposable Attention Model for Natural Language Inference (ver. 2)

We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information. Adding intra-sentence attention that takes a minimum amount of order into account yields further improvements.
Inference, Natural language, Architecture, Hyperparameter, Embedding, Regularization, Neural network, Hidden layer, Word embedding, Pairwise comparisons, ...
• #### Attention Is All You Need (ver. 5)

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
Transduction, Architecture, Recurrent neural network, Machine translation, Convolutional neural network, Embedding, Path length, Hidden layer, Hidden state, Inference, ...
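The abstract above does not spell out the attention mechanism. As a rough illustration, the scaled dot-product form $\mathrm{softmax}(QK^\top/\sqrt{d_k})\,V$ can be sketched in plain Python. The function names and the list-of-lists layout are assumptions made for this sketch. It omits multi-head projections, masking and batching, so it should not be read as the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention over lists of vectors (hypothetical helper).

    queries: list of d_k-dimensional vectors
    keys:    list of d_k-dimensional vectors
    values:  list of d_v-dimensional vectors, one per key
    Returns (outputs, weights), with one output and one weight row per query.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights
```

Each query is processed independently of the others, which gives a small-scale view of why the abstract describes the architecture as more parallelizable than recurrent models.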
• #### Troubling Trends in Machine Learning Scholarship (ver. 2)

Collectively, machine learning (ML) researchers are engaged in the creation and dissemination of knowledge about data-driven algorithms. In a given paper, researchers might aspire to any subset of the following goals, among others: to theoretically characterize what is learnable, to obtain understanding through empirically rigorous experiments, or to build a working system that has high predictive accuracy. While determining which knowledge warrants inquiry may be subjective, once the topic is fixed, papers are most valuable to the community when they act in service of the reader, creating foundational knowledge and communicating as clearly as possible. Recent progress in machine learning comes despite frequent departures from these ideals. In this paper, we focus on the following four patterns that appear to us to be trending in ML scholarship: (i) failure to distinguish between explanation and speculation; (ii) failure to identify the sources of empirical gains, e.g., emphasizing unnecessary modifications to neural architectures when gains actually stem from hyper-parameter tuning; (iii) mathiness: the use of mathematics that obfuscates or impresses rather than clarifies, e.g., by confusing technical and non-technical concepts; and (iv) misuse of language, e.g., by choosing terms of art with colloquial connotations or by overloading established technical terms. While the causes behind these patterns are uncertain, possibilities include the rapid expansion of the community, the consequent thinness of the reviewer pool, and the often-misaligned incentives between scholarship and short-term measures of success (e.g., bibliometrics, attention, and entrepreneurial opportunity). While each pattern offers a corresponding remedy (don't do it), we also discuss some speculative suggestions for how the community might combat these trends.
Machine learning, Ablation, Architecture, Deep learning, Neural network, Generative model, Natural language, Bibliometrics, Classification, Precision, ...
• #### Deep contextualized word representations (ver. 2)

We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
Long short term memory, Architecture, Computational linguistics, Word vectors, Word Sense Disambiguation, Part-of-speech, Training set, Recurrent neural network, Regularization, Nearest-neighbor site, ...
• #### Evaluation of sentence embeddings in downstream and linguistic probing tasks

Despite the fast developmental pace of new sentence embedding methods, it is still challenging to find comprehensive evaluations of these different techniques. In the past years, we saw significant improvements in the field of sentence embeddings and especially towards the development of universal sentence encoders that could provide inductive transfer to a wide variety of downstream tasks. In this work, we perform a comprehensive evaluation of recent methods using a wide variety of downstream and linguistic feature probing tasks. We show that a simple approach using bag-of-words with a recently introduced language model for deep context-dependent word embeddings proved to yield better results in many tasks when compared to sentence encoders trained on entailment datasets. We also show, however, that we are still far away from a universal encoder that can perform consistently across several downstream tasks.
Embedding, Bag of words model, Long short term memory, Classification, Inductive transfer, Logistic regression, COCO simulation, Architecture, Computational linguistics, Word embedding, ...
• #### Document Classification by Inversion of Distributed Language Representations (ver. 3)

There have been many recent advances in the structure and measurement of distributed language models: those that map from words to a vector-space that is rich in information about word choice and composition. This vector-space is the distributed language representation. The goal of this note is to point out that any distributed representation can be turned into a classifier through inversion via Bayes rule. The approach is simple and modular, in that it will work with any language representation whose training can be formulated as optimizing a probability model. In our application to 2 million sentences from Yelp reviews, we also find that it performs as well as or better than complex purpose-built algorithms.
Logistic regression, Classification, Star, Regression, Document classification, Vector space, Distributed bag-of-words, Bayes' rule, Word vectors, Computational linguistics, ...
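The "inversion via Bayes rule" idea in the abstract above can be sketched as follows: fit one language model per class, then score a document by $p(\text{class} \mid \text{doc}) \propto p(\text{doc} \mid \text{class})\,p(\text{class})$. The sketch below uses a smoothed unigram model as a simple stand-in for a distributed language representation. The function names, the toy data and the smoothing constant are all assumptions made for illustration, not details from the paper.

```python
import math
from collections import Counter


def train_unigram(docs, vocab, alpha=1.0):
    """Fit an add-alpha smoothed unigram model; returns log p(word | class)."""
    counts = Counter(word for doc in docs for word in doc.split())
    total = sum(counts[word] for word in vocab) + alpha * len(vocab)
    return {word: math.log((counts[word] + alpha) / total) for word in vocab}


def classify(doc, models, priors):
    """Invert per-class language models with Bayes' rule.

    models: {label: {word: log-probability}}
    priors: {label: prior probability}
    Returns normalized posterior probabilities per label.
    """
    log_post = {}
    for label, log_prob in models.items():
        # Log-likelihood of the document under this class's model;
        # words outside the shared vocabulary are skipped.
        log_lik = sum(log_prob[w] for w in doc.split() if w in log_prob)
        log_post[label] = math.log(priors[label]) + log_lik
    # Normalize in log space for numerical stability.
    peak = max(log_post.values())
    norm = sum(math.exp(v - peak) for v in log_post.values())
    return {label: math.exp(v - peak) / norm for label, v in log_post.items()}


# Toy training data (hypothetical, for illustration only).
train = {
    "pos": ["great food", "great service"],
    "neg": ["bad food", "terrible service"],
}
vocab = {w for docs in train.values() for d in docs for w in d.split()}
models = {label: train_unigram(docs, vocab) for label, docs in train.items()}
posterior = classify("great food", models, {"pos": 0.5, "neg": 0.5})
# For this toy data the posterior is pos: 0.75, neg: 0.25.
```

Because the classifier only needs per-class document likelihoods, the unigram model could in principle be swapped for any representation trained as a probability model, which is the modularity the abstract points to.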
• #### A Hexagon in Saturn's Northern Stratosphere Surrounding the Emerging Summertime Polar Vortex

Saturn's polar stratosphere exhibits the seasonal growth and dissipation of broad, warm, vortices poleward of $\sim75^\circ$ latitude, which are strongest in the summer and absent in winter. The longevity of the exploration of the Saturn system by Cassini allows the use of infrared spectroscopy to trace the formation of the North Polar Stratospheric Vortex (NPSV), a region of enhanced temperatures and elevated hydrocarbon abundances at millibar pressures. We constrain the timescales of stratospheric vortex formation and dissipation in both hemispheres. Although the NPSV formed during late northern spring, by the end of Cassini's reconnaissance (shortly after northern summer solstice), it still did not display the contrasts in temperature and composition that were evident at the south pole during southern summer. The newly-formed NPSV was bounded by a strengthening stratospheric thermal gradient near $78^\circ$N. The emergent boundary was hexagonal, suggesting that the Rossby wave responsible for Saturn's long-lived polar hexagon - which was previously expected to be trapped in the troposphere - can influence the stratospheric temperatures some 300 km above Saturn's clouds.
Saturn, Abundance, Stratosphere, Summer solstice, Troposphere, Time Series, Rossby wave, Dissipation, Spectral resolution, Vorticity, ...
• #### Particle acceleration and the origin of the very high energy emission around black holes and relativistic jets

Particle acceleration induced by fast magnetic reconnection may help to solve current puzzles related to the interpretation of the very high energy (VHE) and neutrino emission from AGNs and compact sources in general. Our general relativistic-MHD simulations of accretion disk-corona systems reveal the growth of turbulence driven by MHD instabilities that lead to the development of fast magnetic reconnection in the corona. In addition, our simulations of relativistic MHD jets reveal the formation of several sites of fast reconnection induced by current-driven kink turbulence. The injection of thousands of test particles in these regions causes acceleration up to energies of several PeVs, thus demonstrating the ability of this process to accelerate particles and produce VHE and neutrino emission, especially in blazars. Finally, we discuss how reconnection can also explain the observed VHE luminosity-black hole mass correlation, involving hundreds of non-blazar sources like Perseus A, and black hole binaries.
Black hole, Blazar, Neutrino, Relativistic jet, Fast Magnetic Reconnection, Magnetic reconnection, Luminosity, Relativistic magnetohydrodynamics, Turbulence, Active Galactic Nuclei, ...
• #### Relaxion Dark Matter

We highlight a new connection between the Standard Model hierarchy problem and the dark matter sector. The key piece is the relaxion field, which besides scanning the Higgs mass and setting the electroweak scale, also constitutes the observed dark matter abundance of the universe. The relaxation mechanism is realized during inflation, and the necessary friction is provided by particle production, with no need for a large number of e-folds and no new physics at the TeV scale. Using this framework we show that the relaxion is a phenomenologically viable dark matter candidate in the keV mass range.
Dark matter, Standard Model, Inflation, Relaxation, Electroweak scale, Higgs boson mass, Higgs boson, Abundance, Dark matter abundance, Higgs field, ...
• #### Block Chain based Intelligent Industrial Network (DSDIN)

The manufacturing industry was centralized in the past due to technical limitations, and factories (especially large manufacturers) gathered almost all of the resources for manufacturing, including technologies, raw materials, equipment, workers, and market information. However, such centralized production is costly, inefficient and inflexible, and it struggles to respond to rapidly changing, diverse and personalized user needs. This paper introduces an Intelligent Industrial Network (DSDIN), which provides a fully distributed manufacturing network where everyone can participate in manufacturing thanks to decentralization and the absence of intermediate links. Participants can quickly get the products or services they want, and can be authorized, recognized and rewarded at low cost for their contributions (such as creative ideas, designs, equipment, raw materials or physical labor). DSDIN is a blockchain-based IoT and AI technology platform, and also an IoT-based intelligent service standard. In the intelligent network formed by DSDIN, the manufacturing center is no longer a factory; in fact, there are no manufacturing centers. DSDIN provides a multi-participation peer-to-peer network for people and things (including raw materials, equipment, finished / semi-finished products, etc.). The information transmitted through the network is called an Intelligent Service Algorithm (ISA). The user can send a process model, formula or control parameter to a device via an ISA, and every transaction in DSDIN is an intelligent service defined by an ISA.
Internet of Things, Market, Peer-to-peer network, Networks, Materials, Algorithms, ...
• #### Herschel and Spitzer observations of slowly rotating, nearby isolated neutron stars

Supernova fallback disks around neutron stars have been discussed to influence the evolution of the diverse neutron star populations. Slowly rotating neutron stars are most promising to find such disks. Searching for the cold and warm debris of old fallback disks, we carried out Herschel PACS (70 $\mu$m, 160 $\mu$m) and Spitzer IRAC (3.6 $\mu$m, 4.5 $\mu$m) observations of eight slowly rotating ($P\approx 3 - 11$ s) nearby ($<1$ kpc) isolated neutron stars. Herschel detected 160 $\mu$m emission ($>5\sigma$) at locations consistent with the positions of the neutron stars RX J0806.4-4123 and RX J2143.0+0654. No other significant infrared emission was detected from the eight neutron stars. We estimate probabilities of 63%, 33% and 3% that, respectively, none, one, or both Herschel PACS 160 $\mu$m detections are unrelated excess sources due to background source confusion or an interstellar cirrus. If the 160 $\mu$m emission is indeed related to cold (10 K to 22 K) dust around the neutron stars, this dust is absorbing and re-emitting $\sim 10$% to $\sim 20$% of the neutron stars' X-rays. Such high efficiencies would be at least three orders of magnitude larger than the efficiencies of debris disks around nondegenerate stars. While thin dusty disks around the neutron stars can be excluded as counterparts of the 160 $\mu$m emission, dusty asteroid belts constitute a viable option.
Neutron star, Dust grain, Magnetar, Point source, Proper motion, Luminosity, Debris disc, Star, Spitzer Space Telescope, Supernova, ...
• #### A general explicit form for higher order approximations for fractional derivatives and its consequences

A general explicit form for generating functions for approximating fractional derivatives is derived. To achieve this, an equivalent characterisation for consistency and order of approximations established on a general generating function is used to form a linear system of equations with Vandermonde matrix for the coefficients of the generating function, which is in the form of a power of a polynomial. This linear system is solved for the coefficients of the polynomial in the generating function. These generating functions completely characterise Grünwald type approximations with shifts and order of accuracy. Incidentally, the constructed generating functions happen to be a generalization of the previously known Lubich forms of generating functions without shift. As a consequence, a general explicit form for new finite difference formulas for integer-order derivatives with any order of accuracy is derived.
Finite difference, Vandermonde determinant, Stencil, Diffusion equation, Fractional calculus, Absolute convergence, Discretization, Convex combination, Anomalous transport, Ordinary differential equations, ...
• #### Integral representation of solutions to higher-order fractional Dirichlet problems on balls (ver. 4)

We provide closed formulas for (unique) solutions of nonhomogeneous Dirichlet problems on balls involving any positive power $s>0$ of the Laplacian. We are able to prescribe values outside the domain and boundary data of different orders using explicit Poisson-type kernels and a new notion of higher-order boundary operator, which recovers normal derivatives if $s$ is a natural number. Our results unify and generalize previous approaches in the study of polyharmonic operators and fractional Laplacians. As applications, we show a novel characterization of $s$-harmonic functions in terms of Martin kernels, a higher-order fractional Hopf Lemma, and examples of positive and sign-changing Green functions.
Green's function, Weak solution, Poisson kernel, Harmonic function, Fractional Laplacian, Dirichlet problem, Maximum principle, Nonnegative, Bounded set, Fundamental lemma of calculus of variations, ...
• #### AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ver. 3)

Model compression is a critical technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets. Conventional model compression techniques rely on hand-crafted heuristics and rule-based policies that require domain experts to explore the large design space trading off among model size, speed, and accuracy, which is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Model Compression (AMC), which leverages reinforcement learning to provide the model compression policy. This learning-based compression policy outperforms the conventional rule-based compression policy by having a higher compression ratio, better preserving the accuracy and freeing human labor. Under 4x FLOPs reduction, we achieved 2.7% better accuracy than the hand-crafted model compression policy for VGG-16 on ImageNet. We applied this automated, push-the-button compression pipeline to MobileNet and achieved 1.81x speedup of measured inference latency on an Android phone and 1.43x speedup on the Titan XP GPU, with only 0.1% loss of ImageNet Top-1 accuracy.
Inference, Reinforcement learning, Neural network, Deep Neural Networks, Architecture, Saturnian satellites, Google.com, Mobile phone, Hidden layer, Network model, ...
• #### Linking galaxy structural properties and star formation activity with IllustrisTNG

We study the connection between active galactic nuclei (AGN) and their host galaxies through cosmic time in the large-scale cosmological IllustrisTNG simulations. We first compare BH properties, i.e. the hard X-ray BH luminosity function, AGN galaxy occupation fraction, and distribution of Eddington ratios, to available observational constraints. The simulations produce a population of BHs in good agreement with observations, but we note an excess of faint AGN in hard X-ray ($L_x \sim 10^{43-44}$ erg/s), and a lower number of bright AGN ($L_x > 10^{44}$ erg/s), a conclusion that varies quantitatively but not qualitatively with BH luminosity estimation method. The lower Eddington ratios of the $10^{9}\,M_\odot$ BHs compared to observations suggest that AGN feedback may be too efficient in this regime. We study galaxy star formation activity and structural properties, and design sample-dependent criteria to identify different galaxy types (star-forming/quiescent, extended/compact) that we apply both to the simulations and observations from the CANDELS fields. We analyze how the simulated and observed galaxies populate the specific star formation rate - stellar mass surface density diagram. A large fraction of the $z=0$, $M_\star > 10^{11}\,M_\odot$ quiescent galaxies first experienced a compaction phase (i.e. reduction of galaxy size) while still forming stars, and then a quenching event. We measure the dependence of AGN fraction on galaxies' locations in this diagram. After correcting the simulations with a redshift and AGN luminosity-dependent model for AGN obscuration, we find good qualitative and quantitative agreement with observations. The AGN fraction is the highest among compact star-forming galaxies (16-20% at $z \sim 1.5-2$), and the lowest among compact quiescent galaxies (6-10% at $z \sim 1.5-2$).
Galaxy, Active Galactic Nuclei, Black hole, Star formation, Massive galaxies, Luminosity, Star-forming galaxy, Compact star, Milky Way, AGN feedback, ...
• #### Starburst galaxies in semi-analytic models of galaxy formation and evolution

We study the shape and evolution of the star formation main sequence in three independently developed semi-analytic models of galaxy formation. We focus, in particular, on the characterization of the model galaxies that are significantly above the main sequence, and that can be identified with galaxies classified as 'starburst' in recent observational work. We find that, in all three models considered, star formation triggered by merger events (both minor and major) contributes only a very small fraction of the cosmic density of star formation. While mergers are associated with bursts of star formation in all models, galaxies that experienced recent merger events are not necessarily significantly above the main sequence. On the other hand, 'starburst galaxies' are not necessarily associated with merger episodes, especially at the low-mass end. Galaxies that experienced recent mergers can have relatively low levels of star formation when/if the merger is gas-poor, and galaxies with no recent merger can experience episodes of starbursts due to a combination of large amounts of cold gas available from cooling/accretion events and/or small disk radii, which increase the cold gas surface density.
GalaxyStar formation rateStar formationStarburst galaxyMain sequence starOf starsStellar massMilky WayCoolingInstability...
• #### From Unified Field Theory to the Standard Model and Beyond

One hundred years ago this year attempts began to generalise general relativity with the ambition of incorporating electromagnetism alongside gravitation in a unified field theory. These developments led to gauge theories and models with extra spatial dimensions that have greatly influenced the modern-day pursuit of a unification scheme incorporating the Standard Model of particle physics, again ideally together with gravity. In this paper we motivate a further natural generalisation from extra spatial dimensions at an elementary level which is found to much more directly accommodate distinctive features of the Standard Model. We also investigate the potential to uncover new physical phenomena, making a case in the neutrino sector for one left-handed neutrino state to be massless, and emphasise the opportunity for a close collaboration between theory and experiment. The new theory possesses a very simple interpretation regarding the underlying source of these empirical structures.
Standard ModelUnified field theoryElectromagnetismActive neutrinoGauge theoryGeneral relativityNeutrinoTheoryDimensionsGravitation...
• #### Anomaly-free Dark Matter with Harmless Direct Detection Constraints (ver. 2)

Dark matter (DM) interacting with the SM fields via a $Z'-$boson ('$Z'$-portal') remains one of the most attractive WIMP scenarios, both from the theoretical and the phenomenological points of view. In order to avoid the strong constraints from direct detection and dilepton production, it is highly convenient that the $Z'$ has axial coupling to DM and leptophobic couplings to the SM particles, respectively. In this paper we first explore the conditions for an anomaly-free leptophobic $Z'$, which (if flavour-blind) has to coincide with that from gauged baryon-number in the SM sector. Then there are very few possibilities where, besides leptophobia, the coupling to DM is axial; namely four (quite similar) cases if the content of the dark sector is minimal. The resulting scenario is very predictive, and perfectly viable from the present constraints from DM detection, EW observables and LHC data (di-lepton, di-jet and mono-jet production). We analyze all these constraints, obtaining the allowed areas in the parameter space, which generically prefer $m_{Z'}\lesssim 500$ GeV, apart from resonant regions. The best chances to test these viable areas come from future LHC measurements.
Dark matterDark sectorLarge Hadron ColliderDark matter particleHiggs bosonMono-jetDark matter annihilationKinetic mixingBaryon numberScale of new physics...
• #### Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks (ver. 3)

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.
OptimizationNeural networkStochastic gradient descentGenerative modelArchitectureLong short term memoryReinforcement learningDeep Neural NetworksStatisticsFisher information matrix...
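The reparameterization described in the weight normalization abstract above, which separates the length of a weight vector from its direction, can be illustrated with a small sketch. This is an illustration under stated assumptions rather than the authors' implementation: the names `weight_norm`, `v` (direction parameter), and `g` (length parameter) are chosen here, and the form `w = g * v / ||v||` is the usual way of writing such a decoupling.

```python
import math


def l2_norm(vec):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def weight_norm(v, g):
    """Build the effective weight w = g * v / ||v||.

    v : direction parameter (hypothetical name), any non-zero vector
    g : scalar length parameter (hypothetical name)
    The length of w is |g| no matter how v is scaled.
    """
    norm = l2_norm(v)
    if norm == 0.0:
        raise ValueError("direction vector v must be non-zero")
    return [g * x / norm for x in v]


# Rescaling v leaves w unchanged; only g controls the length.
w = weight_norm([3.0, 4.0], g=2.0)
w_rescaled = weight_norm([30.0, 40.0], g=2.0)
```

In this sketch an optimizer would update `v` and `g` separately, so a change in the scale of `v` does not change the effective weight.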
• #### Effective Approaches to Attention-based Neural Machine Translation (ver. 4)

An attentional mechanism has lately been used to improve neural machine translation (NMT) by selectively focusing on parts of the source sentence during translation. However, there has been little work exploring useful architectures for attention-based NMT. This paper examines two simple and effective classes of attentional mechanism: a global approach which always attends to all source words and a local one that only looks at a subset of source words at a time. We demonstrate the effectiveness of both approaches over the WMT translation tasks between English and German in both directions. With local attention, we achieve a significant gain of 5.0 BLEU points over non-attentional systems which already incorporate known techniques such as dropout. Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.
Hidden stateArchitectureLong short term memoryRecurrent neural networkNeural networkSchedulingConvolutional neural networkMATLABHyperparameterClassification...
• #### Magnetic susceptibility of quantum spin systems calculated by sine square deformation: one-dimensional, square lattice, and kagome lattice Heisenberg antiferromagnets

We develop a simple and unbiased numerical method to obtain the uniform susceptibility of quantum many body systems. When a Hamiltonian is spatially deformed by multiplying it with a sine square function that smoothly decreases from the system center toward the edges, the size-scaling law of the excitation energy is drastically transformed to a rapidly converging one. Then, the local magnetization at the system center becomes nearly size independent; the one obtained for the deformed Hamiltonian of a system length as small as L=10 provides the value obtained for the original uniform Hamiltonian of L=100. This allows us to evaluate a bulk magnetic susceptibility by using the magnetization at the center by existing numerical solvers without any approximation, parameter tuning, or the size-scaling analysis. We demonstrate that the susceptibilities of the spin-1/2 antiferromagnetic Heisenberg chain and square lattice obtained by our scheme at L=10 agree to within $10^{-3}$ with exact analytical and numerical solutions for $L=\infty$ down to a temperature of 0.1 times the coupling constant. We apply this method to the spin-1/2 kagome lattice Heisenberg antiferromagnet which is of prime interest in the search for spin liquids.
HamiltonianMagnetizationKagome latticeAntiferromagnetDensity of statesFinite size effectAntiferromagneticScaling lawNumerical methodsEntropy...
• #### Ground-state phase diagram of the spin-1/2 square-lattice J1-J2 model with plaquette structure

Using the coupled cluster method for high orders of approximation and Lanczos exact diagonalization we study the ground-state phase diagram of a quantum spin-1/2 J1-J2 model on the square lattice with plaquette structure. We consider antiferromagnetic (J1>0) as well as ferromagnetic (J1<0) nearest-neighbor interactions together with frustrating antiferromagnetic next-nearest-neighbor interaction J2>0. The strength of inter-plaquette interaction lambda varies between lambda=1 (that corresponds to the uniform J1-J2 model) and lambda=0 (that corresponds to isolated frustrated 4-spin plaquettes). While on the classical level (s \to \infty) both versions of models (i.e., with ferro- and antiferromagnetic J1) exhibit the same ground-state behavior, the ground-state phase diagram differs basically for the quantum case s=1/2. For the antiferromagnetic case (J1 > 0) Neel antiferromagnetic long-range order at small J2/J1 and lambda \gtrsim 0.47 as well as collinear striped antiferromagnetic long-range order at large J2/J1 and lambda \gtrsim 0.30 appear which correspond to their classical counterparts. Both semi-classical magnetic phases are separated by a nonmagnetic quantum paramagnetic phase. The parameter region where this nonmagnetic phase exists increases with decreasing lambda. For the ferromagnetic case (J1 < 0) we have the trivial ferromagnetic ground state at small J2/|J1|. With increasing J2 this classical phase gives way to a semi-classical plaquette phase, where the plaquette block spins of length s=2 are antiferromagnetically long-range ordered. Further increasing J2 then yields collinear striped antiferromagnetic long-range order for lambda \gtrsim 0.38, but a nonmagnetic quantum paramagnetic phase for lambda \lesssim 0.38.
Phase diagramAntiferromagnetTwo-point correlation functionJ1 J2 modelCoupled cluster methodMagnetic orderQuantum phase transitionHeisenberg modelMagnetizationCritical line...
• #### Dark Energy in the Swampland (ver. 2)

In this Letter, we study the implications of string Swampland criteria for dark energy in view of ongoing and future cosmological observations. If string theory should be the ultimate quantum gravity theory, there is evidence that exact de Sitter solutions with a positive cosmological constant cannot describe the fate of the late-time universe. Even though cosmological models with dark energy given by a scalar field $\pi$ evolving in time are not in direct tension with string theory, they have to satisfy the Swampland criteria $|\Delta\pi|<d\sim\mathcal{O}(1)$ and $|V'|/V>c\sim\mathcal{O}(1)$, where $V$ is the scalar field potential. In view of the restrictive implications that the Swampland criteria have on dark energy, we investigate the accuracy needed for future observations to tightly constrain standard dark-energy models. We find that current 3-$\sigma$ constraints with $c \lesssim 1.35$ are still well in agreement with the string Swampland criteria. However, Stage-4 surveys such as Euclid, LSST and DESI, tightly constraining the equation of state $w(z)$, will start putting surviving quintessence models into tensions with the string Swampland criteria by demanding $c<0.4$. We further investigate whether any idealised futuristic survey will ever be able to give a decisive answer to the question whether the cosmological constant would be preferred over a time-evolving dark-energy model within the Swampland criteria. Hypothetical surveys with a reduction in the uncertainties by a factor of $\sim20$ compared to Euclid would be necessary to reveal strong tension between quintessence models obeying the string Swampland criteria and observations by pushing the allowed values down to $c<0.1$. In view of such perspectives, there will be fundamental observational limitations with future surveys.
SwamplandDark energyEuclid missionString theoryQuintessenceScalar fieldCosmological constantQuantum gravityEffective field theoryTheories of gravity...
• #### Bubble wall velocities in the Standard Model and beyond

We present results for the bubble wall velocity and bubble wall thickness during a cosmological first-order phase transition in a condensed form. Our results are for minimal extensions of the Standard Model but in principle are applicable to a much broader class of settings. Our first assumption about the model is that only the electroweak Higgs is obtaining a vacuum expectation value during the phase transition. The second is that most of the friction is produced by electroweak gauge bosons and top quarks. Under these assumptions the bubble wall velocity and thickness can be deduced as a function of two equilibrium properties of the plasma: the strength of the phase transition and the pressure difference along the bubble wall.
Bubble wallPhase transitionsHiggs bosonDeviations from equilibriumStandard ModelTop quarkElectroweakEffective potentialBoltzmann transport equationExtensions of the standard model...
• #### Resolving CO (2-1) in z~1.6 Gas-Rich Cluster Galaxies with ALMA: Rotating Molecular Gas Disks with Possible Signatures of Gas Stripping

We present the first spatially-resolved observations of molecular gas in a sample of cluster galaxies beyond z>0.1. Using ALMA, we detect CO (2-1) in 8 z~1.6 cluster galaxies, all within a single 70" primary beam, in under 3 hours of integration time. The cluster, SpARCS-J0225, is replete with gas-rich galaxies in close proximity. It thus affords an efficient multiplexing strategy to build up the first sample of resolved CO in distant galaxy clusters. Mapping out the kinematic structure and morphology of the molecular gas on 3.5 kpc scales reveals rotating gas disks in the majority of the galaxies, as evidenced by smooth velocity gradients. Detailed velocity maps also uncover kinematic peculiarities, including a central gas void, a merger, and a few one-sided gas tails. We compare the extent of the molecular gas component to that of the optical stellar component, measured with rest-frame optical HST imaging. We find that the cluster galaxies, while broadly consistent with a ratio of unity for stellar-to-gas effective radii, have a moderately larger ratio compared to the coeval field; this is consistent with the more pronounced trend in the low-redshift Universe. Thus, at first glance, the z~1.6 cluster galaxies generally look like galaxies infalling from the field, with typical main-sequence star formation rates and massive molecular gas reservoirs situated in rotating disks. However, there are potentially important differences from their field counterparts, including elevated gas fractions, slightly smaller CO disks, and possible asymmetric gas tails. Taken in tandem, these signatures are tentative evidence for gas-stripping in the z~1.6 cluster. However, the current sample size of spatially-resolved molecular gas in galaxies at high redshift is small, and verification of these trends will require much larger samples of both cluster and field galaxies.
GalaxyAtacama Large Millimeter ArrayKinematicsField galaxyStar formation rateMain sequence starHalf-light radiusCluster of galaxiesHubble Space TelescopeStar formation...
• #### Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer (ver. 2)

Deep networks often perform well on the data manifold on which they are trained, yet give incorrect (and often very confident) answers when evaluated on points from off of the training distribution. This is exemplified by the adversarial examples phenomenon but can also be seen in terms of model generalization and domain shift. We propose Manifold Mixup which encourages the network to produce more reasonable and less confident predictions at points with combinations of attributes not seen in the training set. This is accomplished by training on convex combinations of the hidden state representations of data samples. Using this method, we demonstrate improved semi-supervised learning, learning with limited labeled data, and robustness to adversarial examples. Manifold Mixup requires no (significant) additional computation. Analytical experiments on both real data and synthetic data directly support our hypothesis for why the Manifold Mixup method improves results.
ManifoldHidden stateSemi-supervised learningRegularizationHidden layerMachine learningArchitectureGenerative Adversarial NetConvex combinationNearest-neighbor site...
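The core operation named in the Manifold Mixup abstract above, training on convex combinations of hidden state representations, can be sketched as follows. This is a rough illustration, not the authors' code: the function names, the choice to mix the targets with the same coefficient, and drawing that coefficient from a Beta(alpha, alpha) distribution are all assumptions made for this example.

```python
import random


def mix(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b of two equal-length vectors."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def manifold_mixup_pair(h_a, h_b, y_a, y_b, alpha=2.0, rng=random):
    """Mix two hidden representations and their one-hot targets.

    h_a, h_b : hidden-layer activations of two samples (hypothetical inputs)
    y_a, y_b : one-hot label vectors for the same two samples
    alpha    : Beta(alpha, alpha) shape parameter (assumed sampling scheme)
    Returns the mixed hidden state, the mixed target, and the coefficient.
    """
    lam = rng.betavariate(alpha, alpha)
    return mix(h_a, h_b, lam), mix(y_a, y_b, lam), lam


# Example with a seeded generator so the run is repeatable.
rng = random.Random(0)
h_mix, y_mix, lam = manifold_mixup_pair(
    [1.0, 2.0], [3.0, 4.0], [1.0, 0.0], [0.0, 1.0], rng=rng
)
```

In a real network the mixed hidden state would be passed through the remaining layers and the loss computed against the mixed target; that part is omitted here.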
• #### Analysis of the generalization error: Empirical risk minimization over deep artificial neural networks overcomes the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations

The development of new classification and regression algorithms based on empirical risk minimization (ERM) over deep neural network hypothesis classes, coined Deep Learning, revolutionized the area of artificial intelligence, machine learning, and data analysis. More recently, these methods have been applied to the numerical solution of high dimensional PDEs with great success. In particular, recent simulations indicate that deep learning based algorithms are capable of overcoming the curse of dimensionality for the numerical solution of linear Kolmogorov PDEs. Kolmogorov PDEs have been widely used in models from engineering, finance, and the natural sciences. Nearly all approximation methods for Kolmogorov PDEs in the literature suffer under the curse of dimensionality. By contrast, in recent work by some of the authors it was shown that deep ReLU neural networks are capable of approximating solutions of Kolmogorov PDEs without incurring the curse of dimensionality. The present paper considerably strengthens these results by providing an analysis of the generalization error. In particular we show that for Kolmogorov PDEs with affine drift and diffusion coefficients and a given accuracy $\varepsilon>0$, ERM over deep neural network hypothesis classes of size scaling polynomially in the dimension $d$ and $\varepsilon^{-1}$ and with a number of training samples scaling polynomially in the dimension $d$ and $\varepsilon^{-1}$ approximates the solution of the Kolmogorov PDE to within accuracy $\varepsilon$ with high probability. We conclude that ERM over deep neural network hypothesis classes breaks the curse of dimensionality for the numerical solution of linear Kolmogorov PDEs with affine drift and diffusion coefficients. To the best of our knowledge this is the first rigorous mathematical result that proves the efficiency of deep learning methods for high dimensional problems.
Curse of dimensionalityNeural networkDeep learningDeep Neural NetworksActivation functionGeneralization errorRegressionEmpirical risk minimizationBlack-ScholesStatistical learning theory...
• #### Hamiltonian Descent Methods

We propose a family of optimization methods that achieve linear convergence using first-order gradient information and constant step sizes on a class of convex functions much larger than the smooth and strongly convex ones. This larger class includes functions whose second derivatives may be singular or unbounded at their minima. Our methods are discretizations of conformal Hamiltonian dynamics, which generalize the classical momentum method to model the motion of a particle with non-standard kinetic energy exposed to a dissipative force and the gradient field of the function of interest. They are first-order in the sense that they require only gradient computation. Yet, crucially the kinetic gradient map can be designed to incorporate information about the convex conjugate in a fashion that allows for linear convergence on convex functions that may be non-smooth or non-strongly convex. We study in detail one implicit and two explicit methods. For one explicit method, we provide conditions under which it converges to stationary points of non-convex functions. For all, we provide conditions on the convex function and kinetic energy pair that guarantee linear convergence, and show that these conditions can be satisfied by functions with power growth. In sum, these methods expand the class of convex functions on which linear convergence is possible with first-order computation.
HamiltonianDiscretizationOptimizationDissipationPicard-Lindelöf theoremLyapunov functionDualityPreconditionerConvex setFinite difference...
• #### Quantum chromodynamics through the geometry of Möbius structures

This paper describes a rigorous mathematical formulation providing a divergence free framework for QCD and the standard model in curved space-time. The starting point of the theory is the notion of covariance which is interpreted as (4D) conformal covariance rather than the general (diffeomorphism) covariance of general relativity. It is shown how the infinitesimal symmetry group (i.e. Lie algebra) of the theory, that is $su(2,2)$, is a linear direct sum of $su(3)$ and the algebra ${\mathfrak\kappa}\cong sl(2,{\bf C})\times u(1)$, these being the QCD algebra and the electroweak algebra. Fock space which is a graded algebra composed of Hilbert spaces of multiparticle states where the particles can be fermions such as quarks and electrons or bosons such as gluons or photons is described concretely. Algebra bundles whose typical fibers are the Fock spaces are defined. Scattering processes are associated with covariant linear maps between the Fock space fibers which can be generated by intertwining operators between the Fock spaces. It is shown how quark-quark scattering and gluon-gluon scattering are associated with kernels which generate such intertwining operators. The rest of the paper focusses on QCD vacuum polarization in order to compute and display the running coupling constant for QCD at different scales. Through an easy application of the technique called the spectral calculus the densities associated with the quark bubble and the gluon bubble are computed and hence the QCD vacuum polarization function is determined. It is found that the QCD running coupling constant has non-trivial behavior particularly at the subnuclear level. Asymptotic freedom and quark confinement are proved.
InfinitesimalFock spaceCovarianceVacuum polarizationElectroweakSymmetry groupDiffeomorphismCoupling constantGeneral relativityQCD vacuum...
• #### A note on three-point functions of unprotected operators

Given the recent progress in computing three-point functions in N=4 SYM via integrability, I provide here a novel direct calculation of some structure constants at weak coupling. The main focus is on correlators involving more than one unprotected operator, at two-loop order in the perturbative expansion.
Anomalous dimensionPerturbative expansionPropagatorOperator product expansionDimensional regularizationRenormalizationSuper Yang-Mills theoryHigher spinMomentum spaceGegenbauer polynomials...
• #### DreamNLP: Novel NLP System for Clinical Report Metadata Extraction using Count Sketch Data Streaming Algorithm: Preliminary Results

Extracting information from electronic health records (EHR) is a challenging task since it requires prior knowledge of the reports and a natural language processing (NLP) algorithm. With the growing number of EHR implementations, such knowledge is increasingly challenging to obtain in an efficient manner. We address this challenge by proposing a novel methodology to analyze large sets of EHRs using a modified Count Sketch data streaming algorithm termed DreamNLP. By using DreamNLP, we generate a dictionary of frequently occurring terms or heavy hitters in the EHRs using less memory than the conventional counting approaches that other NLP programs use. We demonstrate the extraction of the most important breast diagnosis features from the EHRs in a set of patients that underwent breast imaging. Based on the analysis, extraction of these terms would be useful for defining important features for downstream tasks such as machine learning for precision medicine.
Computational linguisticsPrecisionCountingMachine learningAlgorithmsSketch...
• #### Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results

A new paradigm is beginning to emerge in Radiology with the advent of increased computational capabilities and algorithms. This has led to the ability of real time learning by computer systems of different lesion types to help the radiologist in defining disease. For example, using a deep learning network, we developed and tested a multiparametric deep learning (MPDL) network for segmentation and classification using multiparametric magnetic resonance imaging (mpMRI) radiological images. The MPDL network was constructed from stacked sparse autoencoders with inputs from mpMRI. Evaluation of MPDL consisted of cross-validation, sensitivity, and specificity. Dice similarity between MPDL and post-DCE lesions was evaluated. We demonstrate high sensitivity and specificity for differentiation of malignant from benign lesions of 90% and 85% respectively with an AUC of 0.93. The integrated MPDL method accurately segmented and classified different breast tissue from multiparametric breast MRI using deep learning tissue signatures.
Deep learningAutoencoderClassificationBreast MRIMagnetic resonance imagingNetworksAlgorithms...
• #### Unsupervised Non Linear Dimensionality Reduction Machine Learning methods applied to Multiparametric MRI in cerebral ischemia: Preliminary Results

The evaluation and treatment of acute cerebral ischemia requires a technique that can determine the total area of tissue at risk for infarction using diagnostic magnetic resonance imaging (MRI) sequences. Typical MRI data sets consist of T1- and T2-weighted imaging (T1WI, T2WI) along with advanced MRI parameters of diffusion-weighted imaging (DWI) and perfusion weighted imaging (PWI) methods. Each of these parameters has distinct radiological-pathological meaning. For example, DWI interrogates the movement of water in the tissue and PWI gives an estimate of the blood flow; both are critical measures during the evolution of stroke. In order to integrate these data and give an estimate of the tissue at risk or damaged, we have developed advanced machine learning methods based on unsupervised non-linear dimensionality reduction (NLDR) techniques. NLDR methods are a class of algorithms that use mathematically defined manifolds for statistical sampling of multidimensional classes to generate a discrimination rule of guaranteed statistical accuracy. They can generate a two- or three-dimensional map, which represents the prominent structures of the data and provides an embedded image of meaningful low-dimensional structures hidden in their high-dimensional observations. In this manuscript, we develop NLDR methods on high dimensional MRI data sets of preclinical animals and clinical patients with stroke. On analyzing the performance of these methods, we observed that there was a high degree of similarity between multiparametric embedded images from NLDR methods and the ADC map and perfusion map. It was also observed that the embedded scattergram of abnormal (infarcted or at risk) tissue can be visualized and provides a mechanism for automatic methods to delineate potential stroke volumes and early tissue at risk.
Machine learningManifoldMagnetic resonance imagingAlgorithmsPotential...
• #### Damn You, Little h! (or, Real-World Applications Of The Hubble Constant Using Observed And Simulated Data) (ver. 3)

The Hubble constant, H0, or its dimensionless equivalent, "little h", is a fundamental cosmological property that is now known to an accuracy better than a few percent. Despite its cosmological nature, little h commonly appears in the measured properties of individual galaxies. This can pose unique challenges for users of such data, particularly with survey data. In this paper we show how little h arises in the measurement of galaxies, how to compare like-properties from different datasets that have assumed different little h cosmologies, and how to fairly compare theoretical data with observed data, where little h can manifest in vastly different ways. This last point is particularly important when observations are used to calibrate galaxy formation models, as calibrating with the wrong (or no) little h can lead to disastrous results when the model is later converted to the correct h cosmology. We argue that in this modern age little h is an anachronism, being one of the least uncertain parameters in astrophysics, and we propose that observers and theorists instead treat this uncertainty like any other. We conclude with a "cheat sheet" of nine points that should be followed when dealing with little h in data analysis.
CosmologyHubble constantCalibrationHubble lawGalaxyGalaxy FormationLuminosityStellar massHubble timeAbsolute magnitude...
• #### Bit-Metric Decoding of Non-Binary LDPC Codes with Probabilistic Amplitude Shaping

A new approach for combining non-binary low-density parity-check (NB-LDPC) codes with higher-order modulation and probabilistic amplitude shaping (PAS) is presented. Instead of symbol-metric decoding (SMD), a bit-metric decoder (BMD) is used so that matching the field order of the non-binary code to the constellation size is not needed, which increases the flexibility of the coding scheme. Information rates, density evolution thresholds and finite-length simulations show that the flexibility comes at no loss of performance if PAS is used.
ConstellationsGalois fieldMutual informationRankNumerical simulationMonte Carlo methodArchitectureConjunctionQuadratureSphere packing...
• #### A note on reducing the computation time for minimum distance and equivalence check of binary linear codes

In this paper we show the usability of the Gray code with constant weight words for computing linear combinations of codewords. This can lead to a big improvement of the computation time for finding the minimum distance of a code. We have also considered the usefulness of combinatorial $2$-$(t,k,1)$ designs when there are memory limitations to the number of objects (linear codes in particular) that can be tested for equivalence.
ClassificationSoftwareBit arrayGalois fieldIntersection numberCryptographyPermutationObjectAlgorithmsCommunication...