# Seminars archive

 Search keyword Select type Any typeGeneral interest (colloquium, job talks, ...)AlgebraAnalysis and Differential GeometryCombinatoricsComputer ScienceNumber TheoryNumerical Analysis and Scientific ComputingTopologyOther
 Upcoming Seminars Tue09/25/20184:00pm Seminar: AlgebraTitle to be announcedRenee Bell, University of PennsylvaniaContact: David Zureick-Brown, dzb@mathcs.emory.eduVenue: Mathematics and Science Center, Room W301Download printable flyer (PDF, 19.8 kB) Mon10/01/20184:00pm Seminar: CombinatoricsOn the Erdos-Gyarfas distinct distances problem with local constraintsCosmin Pohoata, The California Institute of TechnologyContact: Dwight Duffus, dwightduffus@emory.eduVenue: Mathematics and Science Center, Room E408Download printable flyer (PDF, 115 kB)Show abstractIn 1946 Erdos asked to determine or estimate the minimum number of distinct distances determined by an n-element planar point set V. He showed that a square integer lattice determines \Theta(n/\sqrt{log n}) distinct distances, and conjectured that any n-element point set determines at least n^{1−o(1)} distinct distances. In 2010-2015, Guth and Katz answered Erdos’s question by proving that any n-element planar point set determines at least \Omega(n/log n) distinct distances. In this talk, we consider a variant of this problem by Erdos and Gyarfas. For integers n, p, q with p \geq q \geq 2, determine the minimum number D(n,p,q) of distinct distances determined by a planar n-element point set V with the property that any p points from V determine at least q distinct distance. In a recent paper, Fox, Pach and Suk prove that when q = {p \choose 2} - p + 6, D(n,p,q) is always at least n^{8/7 - o(1)}. We will discuss a recent improvement of their result and some new bounds for a related (graph theoretic) Ramsey problem of Erdos and Shelah which arise. This is joint work with Adam Sheffer. Tue10/16/20184:00pm Seminar: AlgebraTitle to be announcedEva Bayer Fluckinger, EPFLContact: David Zureick-Brown, dzb@mathcs.emory.eduVenue: Mathematics and Science Center, Room W301Download printable flyer (PDF, 20.3 kB) Tue10/23/20184:00pm Seminar: AlgebraJoint Athens-Atlanta Number Theory SeminarLarry Rolen and Bianca Viray, Vanderbilt and University of WashingtonContact: David Zureick-Brown, dzb@mathcs.emory.eduVenue: Mathematics and Science Center, Room W301Download printable flyer (PDF, 23.7 kB) Tue10/30/20184:00pm Seminar: AlgebraTitle to be announcedAnne Qu\'eguiner-Mathieu, ParisContact: David Zureick-Brown, dzb@mathcs.emory.eduVenue: Mathematics and Science Center, Room W301Download printable flyer (PDF, 19.6 kB) Tue11/27/20184:00pm Seminar: AlgebraTitle to be announcedNatalie Paquette, CaltechContact: David Zureick-Brown, dzb@mathcs.emory.eduVenue: Mathematics and Science Center, Room W301Download printable flyer (PDF, 19.3 kB) Past Seminars Mon04/03/20171:00pm Defense: DissertationZero-Cycles on Torsors under Linear Algebraic GroupsReed Sarney, Emory UniversityContact: Reed Sarney, rlgordo@emory.eduVenue: Mathematics and Science Center, Room W303Download printable flyer (PDF, 49 kB)Show abstractLet $k$ be a field, let $G$ be a smooth connected linear algebraic group over $k$, and let $X$ be a $G$-torsor. Totaro asked: if $X$ admits a zero-cycle of degree $d$, does $X$ have a closed {\'e}tale point of degree dividing $d$? We give a positive answer in two cases: \begin{enumerate} \item $G$ is an algebraic torus of rank $\leq 2$ and $\textup{ch}(k)$ is arbitrary, and \item $G$ is an absolutely simple adjoint group of type $A_1$ or $A_{2n}$ and $\textup{ch}(k) \neq 2$. \end{enumerate} We also present the first known examples where Totaro's question has a negative answer. Thu03/30/20174:00pm SeminarTheory for New Machine Learning Problems and ApplicationsYingyu Liang, Princeton UniversityContact: James Lu, jlu@mathcs.emory.eduVenue: Mathematics and Science Center, Room W201Download printable flyer (PDF, 43.4 kB)Show abstractMachine learning has recently achieved great empirical success. This comes along with new challenges, such as sophisticated models that lack rigorous analysis, simple algorithms with practical success on hard optimization problems, and handling large scale datasets under resource constraints. In this talk, I will present some of my work in addressing such challenges.\\ \\This first part of the talk focuses on learning semantic representations for text data. Recent advances in natural language processing build upon the approach of embedding words as low dimensional vectors. The fundamental observation that empirically justifies this approach is that these vectors can capture semantic relations. A probabilistic model for generating text is proposed to mathematically explain this observation and existing popular embedding algorithms. It also reveals surprising connections to classical notions such as Pointwise Mutual Information, and allows to design novel, simple, and practical algorithms for applications such as sentence embedding.\\ \\In the second part, I will describe my work on distributed unsupervised learning over large-scale data distributed over different locations. For the prototypical tasks clustering, Principal Component Analysis (PCA), and kernel PCA, I will present algorithms that have provable guarantees on solution quality, communication cost nearly optimal in key parameters, and strong empirical performance. \\ \\Bio: Yingyu Liang is an associate research scholar in the Computer Science Department at Princeton University. His research interests are providing rigorous analysis for machine learning models and designing efficient algorithms for applications. He received a B.S. in 2008 and an M.S. in 2010 in Computer Science from Tsinghua University, and a Ph.D. degree in Computer Science from Georgia Institute of Technology in 2014. He was a postdoctoral researcher in 2014-2016 in the Computer Science Department at Princeton University. Thu03/30/20171:00pm Defense: Honors ThesisThe Artin-Schreier Theorem in Galois TheoryYining Cheng, Emory UniversityVenue: Mathematics and Science Center, Room W303Download printable flyer (PDF, 37.3 kB)Show abstractWe first list and state some basic definitions and theorems of the Galois theory of finite extensions, as well as state and prove the Kummer theory and the Artin-Schreier extensions as prerequisites. The main part of this thesis is the proof of the Artin-Schreier Theorem, which states that an algebraic closed field having finite extension with its subfield F has degree at most two and F must have characteristic 0. After the proof, we will discuss the applications for the Artin-Schreier Theorem. Thu03/30/201710:00am Defense: DissertationScalable Computational pathology: From Interactive to Deep LearningMichael Nalisnik, Emory UniversityContact: Lee Cooper, lee.cooper@emory.eduVenue: Mathematics and Science Center, Room W306Download printable flyer (PDF, 41.5 kB)Show abstractAdvances in microscopy imaging and genomics have created an explosion of patient data in the pathology domain. Whole-slide images of histologic sections contain rich information describing the diverse cellular elements of tissue microenvironments. These images capture, in high resolution, the visual cues that have been the basis of pathologic diagnosis for over a century. Each whole-slide image contains billions of pixels and up to a million or more microanatomic objects whose appearances hold important prognostic information. Combining this information with genomic and clinical data provides insight into disease biology and patient outcomes. Yet, due to the size and complexity of the data, the software tools needed to allow scientists and clinicians to extract insight from these resources are non-existent or limited. Additionally, current methods utilizing humans is highly subjective and not repeatable. This work aims to address these shortcomings with a set of open-source computational pathology tools which aim to provide scalable, objective and repeatable classification of histologic entities such as cell nuclei.\\ \\ We first present a comprehensive interactive machine learning framework for assembling training sets for the classification of histologic objects within whole-slide images. The system provides a complete infrastructure capable of managing the terabytes worth of images, object features, annotations and metadata in real-time. Active learning algorithms are employed to allow the user and system to work together in an intuitive manner, allowing the efficient selection of samples from unlabeled pools of objects numbering in the hundreds of millions. We demonstrate how the system can be used to phenotype microvascular structures in gliomas to predict survival, and to explore the molecular pathways associated with these phenotypes. Quantitative metrics are developed to describe these structures.\\ \\ We also present a scalable, high-throughput, deep convolutional learning framework for the classification of histologic objects is presented. Due to its use of representation learning, the framework does not require the images to be segmented, instead learning optimal task-specific features in an unbiased manner. Addressing scalability, the graph-based, parallel architecture of the framework allows for the processing of large whole-slide image archives consisting of hundreds of slides and hundreds of millions of histologic objects. We explore the efficacy of various deep convolutional network architectures and demonstrate the system's capabilities classifying cell nuclei in lower grade gliomas. Wed03/29/20174:00pm Seminar: Computer ScienceUtility-cost of provable privacy: A case study on US Census data.Ashwin Machanavajjhala, Duke UniversityContact: Li Xiong, lxiong@emory.eduVenue: Mathematics and Science Center, Room W303Download printable flyer (PDF, 43.8 kB)Show abstractPrivacy is an important constraint that algorithms must satisfy when analyzing sensitive data from individuals. Differential privacy has revolutionized the way we reason about privacy, and has championed the need for data analysis algorithms with provable privacy guarantees. Differential privacy and its variants have arisen as the gold standard for exploring the tradeoff between the privacy ensured to individuals and the utility of the statistical insights mined from the data, and are in use by many commercial (e.g., Google and Apple) and government entities (e.g., US Census) for collecting and sharing sensitive user data.\\ \\ In today's talk I will highlight key challenges in designing differentially private algorithms for emerging applications, and highlight research from our group that try to address these challenges. In particular I will describe our recent work on modernizing the data publication process for a US Census Bureau data product, called LODES/OnTheMap. In this work, we identified legal statutes and their current interpretations that regulate the publication of LODES/OnTheMap data, formulated these regulations mathematically, and designed algorithms for releasing tabular summaries that provably ensured these privacy requirements. Our solutions are able to release summaries of the data with error comparable or even better than current releases (which are not provably private), for reasonable settings of privacy parameters.\\ \\ Bio: Ashwin Machanavajjhala is an Assistant Professor in the Department of Computer Science, Duke University. Previously, he was a Senior Research Scientist in the Knowledge Management group at Yahoo! Research. His primary research interests lie in algorithms for ensuring privacy in statistical databases and augmented reality applications. He is a recipient of the National Science Foundation Faculty Early CAREER award in 2013, and the 2008 ACM SIGMOD Jim Gray Dissertation Award Honorable Mention. Ashwin graduated with a Ph.D. from the Department of Computer Science, Cornell University and a B.Tech in Computer Science and Engineering from the Indian Institute of Technology, Madras. Wed03/29/20171:00pm Defense: HonorsA Study of Benford's Law for the Values of Arithmetic FunctionsLetian Wang, Emory UniversityContact: Letian Wang, letian.wang@emory.eduVenue: Mathematics and Science Center, Room E408Download printable flyer (PDF, 42.1 kB)Show abstract"Benford's Law characterizes the distribution of initial digits in large datasets across disciplines. Since its discovery by Simon Newcomb in 1881, Benford's Law has triggered tremendous studies. In this paper, we will start by introducing the history of Benford's Law and discussing in detail the explanations proposed by mathematicians on why various datasets are Benford. Such explanations include the Spread Hypothesis, the Geometric, the Scale-Invariance, and the Central Limit explanations. "To rigorously de ne Benford's Law and to motivate criteria for Benford sequences, we will provide fundamental theorems in uniform distribution modulo 1 in Chapter 2. We will state and prove criteria for checking uniform distribution, including Weyl's Criterion, Van der Corput's Di erence Theorem, as well as their corollaries.\\ \\"In Chapter 3, we will introduce the logarithm map, which allows us to reformulate Benford's Law with uniform distribution modulo 1 studied earlier. We will start by examining the case of base 10 only and then generalize to arbitrary bases. "Finally, we will elaborate on the idea of good functions. We will prove that good functions are Benford, which in turn enables us to nd a new class of Benford sequences. We will use this theorem to show that the partition function p(n) and the factorial sequence n! follow Benford's Law." Wed03/29/201711:30am Defense: DissertationProtecting Locations of Individual Movement under Temporal CorrelationsYonghui Xiao, Emory UniversityContact: Yonghui Xiao, yonghui.xiao@emory.eduVenue: Chemistry 320Download printable flyer (PDF, 44.8 kB)Show abstractConcerns on location privacy frequently arise with the rapid development of GPS enabled devices and location-based applications. In this dissertation, we study how to protect the locations of individual movement under temporal correlations. First, we propose three types of privacy notions, location privacy, customizable privacy, and the privacy for spatiotemporal events. Location privacy is used to protect the true location of a user at each timestamp; Customizable privacy means the user can configure personalized privacy notions depending different demand; Privacy for spatiotemporal events needs to be specially preserved because even if location privacy is guaranteed at each timestamp the spatiotemporal events can still be exposed. Second, we investigate how to preserve these privacy notions. We show that the traditional $\ell_{1}$-norm sensitivity in differential privacy exaggerates the real sensitivity, and thus leads to too much noise in the released data. Hence we study the real sensitivity, called sensitivity hull, for the data release mechanism. Then we design the optimal location release mechanism for location privacy. We show that the data release mechanism has to be dynamically updated for the customizable privacy to guarantee the privacy is protectable, which is measured by a notion of degree of protection. To protect the spatiotemporal events we study how to derive the probability of the spatiotemporal events for arbitrary initial probabilities of adversaries. Then we check whether to release the location at each timestamp to bound the risk of the spatiotemporal events. Third, we implement these algorithms on real-world datasets to demonstrate the efficiency and effectiveness. Tue03/28/20174:00pm Seminar: AlgebraFinite index for arboreal Galois representationsAndrew Bridy, Texas A and MContact: David Zureick-Brown, dzb@mathcs.emory.eduVenue: Mathematics and Science Center, Room W201Download printable flyer (PDF, 46.4 kB)Show abstractLet K be a global field of characteristic 0, let f in $K(x)$ and b in K, and set $K_n := K(f^{-n}(b))$. The projective limit of the groups $Gal(K_n/K)$ embeds in the automorphism group of an infinite rooted tree. A difficult problem is to find criteria that guarantee the index is finite; a complete answer would give a dynamical analogue of Serre's famous open image theorem. When f is a cubic polynomial over a function field, I prove a set of necessary and sufficient conditions for finite index (for number fields, the proof is conditional on Vojta's conjecture). This is joint work with Tom Tucker. Tue03/28/20172:45pm Defense: DissertationOn Saturation SpectrumJessica Fuller, EmoryContact: Jessica Fuller, jfulle@emory.eduVenue: Mathematics and Science Center, Room E406Download printable flyer (PDF, 37.5 kB)Show abstractGiven a graph H, we say a graph G is H-saturated if G does not contain H as a subgraph and the addition of any edge not already in G results in H as a subgraph. The question of the minimum number of edges of an H saturated graph on n vertices, known as the saturation number, and the question of the maximum number of edges possible of an H -saturated graph, known as the Turán number, has been addressed for many different types of graphs. We are interested in the existence of H -saturated graphs for each edge count between the saturation number and the Turán number. We determine the saturation spectrum of (Kt-e)-saturated graphs and Ft-saturated graphs. Let (Kt-e) be the complete graph minus one edge. We prove that (Kt-e)-saturated graphs do not exist for small edge counts and construct (Kt-e)-saturated graphs with edge counts in a continuous interval. We then extend the constructed (Kt-e)-saturated graphs to create (Kt-e)-saturated graphs. Let Ft be the graph consisting of t edge-disjoint triangles that intersect at a single vertex v. We prove that F2-saturated graphs do not exist for small edge counts and construct a collection of F2-saturated graphs with edge counts in a continuous interval. We also establish more general constructions that yield a collection of Ft-saturated graphs with edge counts in a continuous interval. Mon03/27/20174:00pm SeminarCompositional Models for Information ExtractionMark Dredze, Johns Hopkins UniversityContact: Eugene Agichtein, eugene@mathcs.emory.eduVenue: White Hall 207Download printable flyer (PDF, 36.8 kB)Show abstractInformation extraction systems are the backbone of many end-user applications, including question answering, web search and clinical text analysis. These applications depend on underlying technologies that can identify entities and relations as expressed in natural language text. For example, Amazon Echo may answer a user question based on a relation extracted from a news article. A clinical decision support system may offer a physician suggestions based on a symptom identified in the clinical notes from a previous patient visit. In political science, we may seek to aggregate opinions expressed in public comments about a new public policy. Advances in machine learning have led to new neural models for learning effective representations directly from data that improve information extraction tasks. Yet for many tasks, years of research have created hand-engineered features that yield state of the art performance. I will present feature-rich compositional models that combine both hand-engineered features with learned text representations to achieve new state-of-the-art results for relation extraction. These models are widely applicable to problems within natural language processing and beyond. Additionally, I will survey how these models fit into my broader research program by highlighting work by my group on developing new machine learning methods for extracting public health information from clinical and social media text.