Internet Measurement Conference 2012: by Hamed Haddadi

IMC 2012.

Paper 1: Using the CAIDA telescope to collect scans of SIP services. The scans have a sophisticated pattern which survives even when only a small number of hosts take part in the botnet. The paper is a good demonstration of solid measurement work, like many other IMC papers.
Also an amazing animation (CAIDA Cuttlefish).

Paper 2: Prefix hijacking detection has been attracting a lot of attention lately. The authors form a live fingerprint of the route update distribution patterns, then identify and classify hijacks, failures and route anomalies using threshold techniques and distributed “eyes”, with less than 10 seconds of delay.

Paper 4: Looking at one-way traffic on the net, which can shed light on a large number of anomalies. They use a large NetFlow dataset to analyse these packets (which never receive a reply). Interestingly, over 7 years of their data, a significant portion of flows (30–70%) is one-way, but these account for a very low volume of traffic as they are usually small packets.
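
As a rough illustration of the idea (not the authors' method), one-way flows can be picked out of flow records by checking whether the reverse 5-tuple ever appears. The `Flow` record layout and the function names below are assumptions made for this sketch.

```python
from collections import namedtuple

# Hypothetical flow record; real NetFlow exports carry many more fields.
Flow = namedtuple("Flow", "src dst sport dport proto packets bytes")


def split_one_way(flows):
    """Split flows into (one_way, two_way).

    A flow counts as two-way if a flow with the reversed 5-tuple
    is also present in the dataset.
    """
    keys = {(f.src, f.dst, f.sport, f.dport, f.proto) for f in flows}
    one_way, two_way = [], []
    for f in flows:
        reverse = (f.dst, f.src, f.dport, f.sport, f.proto)
        (two_way if reverse in keys else one_way).append(f)
    return one_way, two_way


def one_way_shares(flows):
    """Return (share of flows, share of bytes) that are one-way."""
    one_way, _ = split_one_way(flows)
    total_bytes = sum(f.bytes for f in flows)
    flow_share = len(one_way) / len(flows) if flows else 0.0
    byte_share = sum(f.bytes for f in one_way) / total_bytes if total_bytes else 0.0
    return flow_share, byte_share
```

On a toy dataset with one bidirectional transfer and two unanswered probes, half of the flows are one-way but they carry only a tiny share of the bytes, which mirrors the observation above.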

Paper 3: The paper focuses on concurrent prefix hijacks, where an AS hijacks prefixes of a number of other ASes. These are becoming more common as full table leaks are difficult and are detected faster. It is also a big task to filter out the individual valid changes in AS prefixes. There are a number of interesting case studies in the paper.

Paper 2, morning session: Fast classification at wire speed with commodity hardware. The paper has an interesting analysis of the pros and cons of speed vs accuracy, number of cores and amount of memory. They have used synthetic and real traces from CAIDA. The optimal classification can be done when there is one core dedicated per queue.

1. Fathom: A Browser-based Network Measurement Platform (review)
Mohan Dhawan (Rutgers University), Justin Samuel (UC Berkeley), Renata Teixeira (CNRS & UPMC), Christian Kreibich, Mark Allman, and Nicholas Weaver (ICSI), and Vern Paxson (ICSI & UC Berkeley)
Interesting measurement methodology using a Firefox extension.

Transition to IPv6: they have used JavaScript on websites and Google Flash ads to estimate the number of observed networks which have IPv6 enabled, though using these has introduced very interesting biases towards Asian and Latin American countries. They notice that no one is taking action for adoption. A high proportion of 6to4 tunnelling is seen, and corporate networks seem to be leading the way in adoption. The findings indicate a delay for Teredo, hence Microsoft hasn’t enabled it by default.
The sampling technique in the paper is very interesting.

3. MAPLE: A Scalable Architecture for Maintaining Packet Latency Measurements (review)
Myungjin Lee (Purdue University), Nick Duffield (AT&T Labs-Research), and Ramana Rao Kompella (Purdue University)
Another tool paper, specific to latency measurements. It moves to per-packet granularity to obtain latency measurements at packet level rather than flow level. They use timestamped packets and keep track of them using hash tables and a variant of Bloom filters for efficiency.

4. Can you GET me now? Estimating the Time-to-First-Byte of HTTP transactions with Passive Measurements (review) (short paper)
Emir Halepovic, Jeffrey Pang, and Oliver Spatscheck (AT&T Labs-Research)
The motivation is to measure user-experienced delay, using passive analysis for convenience and representativeness. They define TTFB as the time between the SYN-ACK and the first byte of HTTP data, and show that TTFB captures user experience better than RTT.
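
The TTFB definition above is easy to express in code. This is only a minimal sketch: the event labels (`"SYN-ACK"`, `"HTTP-DATA"`) and the per-connection list of `(timestamp, kind)` tuples are assumptions for illustration, not the format used in the paper.

```python
def time_to_first_byte(events):
    """Return seconds between the SYN-ACK and the first HTTP data byte.

    `events` is an iterable of (timestamp_seconds, kind) tuples for a
    single connection. Returns None if either event is missing.
    """
    syn_ack = None
    for ts, kind in sorted(events):
        if kind == "SYN-ACK" and syn_ack is None:
            syn_ack = ts
        elif kind == "HTTP-DATA" and syn_ack is not None:
            # First response data seen after the handshake reply.
            return ts - syn_ack
    return None
```

Because the SYN-ACK and the first data byte are both visible at a passive monitor, this value can be computed without any active probing.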

5. Towards Geolocation of Millions of IP Addresses (review) (short paper)
Zi Hu, John Heidemann, and Yuri Pradkin
Improvements to the popular MaxMind geolocation system, in an open geolocation database format for all addresses. They use a vantage point system to triangulate IP address locations. Accuracy is preserved by choosing a number of vantage points.

1. Evolution of Social-Attribute Networks: Measurements, Modeling, and Implications using Google+ (review)
Neil Zhenqiang Gong (EECS, UC Berkeley), Wenchang Xu (CS, Tsinghua University), Ling Huang (Intel Lab), Prateek Mittal (EECS, UC Berkeley), Vyas Sekar (Intel Lab), and Emil Stefanov and Dawn Song (EECS, UC Berkeley)
First large-scale study of an OSN's evolution. Using breadth-first search crawling and differentiating between follower and followee graphs, they design a new model based on the observation that Google+ has a large number of low-degree nodes, with a log-normal distribution. This makes Google+ a hybrid between Facebook and Twitter. They also look at the triadic closure model and find it better than preferential attachment.
I am surprised they didn’t check the correlation between the number of posts and the degree of nodes; attributes such as LinkedIn endorsed skills may also play a role in this relationship.

2. Evolution of a Location-based Online Social Network: Analysis and Models (review)
Miltiadis Allamanis, Salvatore Scellato, and Cecilia Mascolo (University of Cambridge)
Looking at spatial and location-based social networks. Using daily snapshots of the Gowalla social network and the check-ins of 122k users, they explore global attachment models such as the preferential attachment model, age model, distance model and the gravity model. 30% of new edges are between users that have one check-in in common.

3. New Kid on the Block: Exploring the Google+ Social Graph (review)
Gabriel Magno and Giovanni Comarela (Federal University of Minas Gerais), Diego Saez-Trumper (Universitat Pompeu Fabra), Meeyoung Cha (Korea Advanced Institute of Science and Technology), and Virgilio Almeida (Federal University of Minas Gerais)
Another Google+ paper, looking at information sharing and privacy settings in Google+. Some users put private data such as home and mobile numbers, though own users are known to be more risk taking. A bunch of other metrics are also discussed; however, the types of users are not discussed. The data also shows a strong geographical correlation of friendship between users, showing that offline relationships are also reflected in the data. I imagine the data may obviously have strong errors, as some users put a premium number in the phone field to collect money :)

4. Multi-scale Dynamics in a Massive Online Social Network (review)
Xiaohan Zhao (UC Santa Barbara), Alessandra Sala (Bell Labs, Ireland), Christo Wilson (UC Santa Barbara), Xiao Wang (Renren Inc.), Sabrina Gaito (Università degli Studi di Milano), and Haitao Zheng and Ben Y. Zhao (UC Santa Barbara)
Looking at the evolution of user activity and growth of the network, using Renren, the Chinese Facebook equivalent, and capturing node and edge dynamics over 2 years. They study network growth, the effect of node age and preferential attachment, and how these change as the network matures. They also look at community formation, lifetime and similarity using set intersection and the Jaccard coefficient. The driving force behind edge creation shifts from new nodes to old nodes as the network grows, and preferential attachment strength also decays.
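
For reference, the Jaccard coefficient of two communities is the size of their intersection divided by the size of their union. The sketch below shows it on plain Python sets; the `best_match` helper for tracking a community between two snapshots is a hypothetical addition for illustration, not the paper's procedure.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two member collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(community, next_snapshot):
    """Return (index, score) of the most similar community in the next snapshot.

    `next_snapshot` is a list of member collections. Returns (None, 0.0)
    if the snapshot is empty.
    """
    best_index, best_score = None, 0.0
    for i, candidate in enumerate(next_snapshot):
        score = jaccard(community, candidate)
        if best_index is None or score > best_score:
            best_index, best_score = i, score
    return best_index, best_score
```

A community whose best match in the following snapshot scores close to 1 has barely changed, while a low score suggests it has split, merged or dissolved.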

Day 2

8:30-10:15 Video On Demand. Session Chair: Mark Allman (ICSI)

1. Watching Video from Everywhere: a Study of the PPTV Mobile VoD System (review)

Zhenyu Li, Jiali Lin, Marc-Ismael Akodjenou-Jeannin, and Gaogang Xie (ICT, CAS), Mohamed Ali Kaafar (INRIA), and Yun Jin and Gang Peng (PPlive)
Dataset of smartphone video views from 4M users watching 400k videos over two weeks. The results can be a good guide for those designing wireless provisioning. The trends of watching long videos versus short videos are plotted against time of day, which is interesting. 3G users are more likely to watch movies but they often give up at the beginning.

2. Program popularity and viewer behaviour in a large TV on demand system (review)

Henrik Abrahamsson (SICS) and Mattias Nordmark (TeliaSonera)

Looking at TV and video on-demand access patterns, the usual heavy tail and top-100 popularity trends can be seen. They find the cacheability very high, so with caching of the top 5% of videos, the hit rate increases 50%.
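
To see why a heavy-tailed popularity distribution makes a small cache so effective, here is a minimal sketch that computes the hit rate of a static cache holding the most popular fraction of videos. The function name and the per-video view-count input are assumptions for illustration; the paper's own cache model may differ.

```python
def hit_rate_for_top_fraction(view_counts, fraction):
    """Share of all views served by caching the top `fraction` of videos.

    `view_counts` is a list with one view count per video. At least one
    video is cached whenever the list is non-empty.
    """
    ranked = sorted(view_counts, reverse=True)
    cached = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:cached]) / total if total else 0.0
```

For example, with 20 videos where one has 50 views and the rest one view each, caching the top 5% (a single video) already serves 50 of the 69 views.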

Video Stream Quality Impacts Viewer Behavior: Inferring Causality using Quasi-Experimental Designs (review)

S. Shanmuga Krishnan (Akamai Technologies) and Ramesh K. Sitaraman (University of Massachusetts, Amherst and Akamai Technologies)

Nice introduction to video delivery economics, aiming to improve user behaviour and performance. The performance aspect is understood, but the improved “user behaviour” is not clear. A large dataset of video views is presented. Using randomised experiments (Fisher 1937) they look at correlation vs causation for different factors such as geography and content times by treating users differently, for example for re-buffering of videos and its effect on video abandonment. Patience increases with the length of the videos, so short video clips are abandoned fast if they are slow to load. Mobile users are more patient than fiber users, so access technology also plays a role.

Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard (review)

Te-Yuan Huang, Nikhil Handigol, Brandon Heller, Nick McKeown, and Ramesh Johari (Stanford University)

The performance of video rate selection over HTTP/TCP is analysed. Competing flows make the video rate drop too low, below an acceptable value. The on-off traffic pattern due to buffering heavily affects TCP's congestion window management due to slow start, and hence leads to bandwidth underestimation. This is due to the video client trying to do TCP’s job and estimating bandwidth. Perhaps a video-specific protocol is needed?

On the Incompleteness of the AS-level graph: a Novel Methodology for BGP Route Collector Placement (review)
Enrico Gregori (IIT-CNR), Alessandro Improta (University of Pisa / IIT-CNR), Luciano Lenzini (University of Pisa), Lorenzo Rossi (IIT-CNR), and Luca Sani (IMT Lucca)

The paper shows the geographic distribution of feeders and their coverage of the AS topology dataset. They increase the accuracy by adding more route collectors. I believe (and Walter Willinger also mentioned) that the work can be improved heavily by using IXP data.

Quantifying Violations of Destination-based Forwarding on the Internet (review) (short paper)

Tobias Flach, Ethan Katz-Bassett, and Ramesh Govindan (University of Southern California)

Using reverse traceroute to find destination-based forwarding violations, e.g., by MPLS tunnels or load balancing, using PlanetLab nodes and destinations with spoofed packets along paths. A large portion of violations are caused by load balancing. For 29% of the targeted routers, the router forwards traffic going to a single destination via different next hops, and 1.3% of the routers even select next hops in different ASes.
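
The violation being counted here can be stated very simply: under destination-based forwarding, a router should use one next hop per destination. A minimal sketch of that check follows, assuming hypothetical `(router, destination, next_hop)` observations already extracted from traceroute-style measurements; it is not the authors' tooling.

```python
from collections import defaultdict


def find_violations(observations):
    """Return {(router, destination): next_hops} where more than one next hop was seen."""
    next_hops = defaultdict(set)
    for router, destination, next_hop in observations:
        next_hops[(router, destination)].add(next_hop)
    return {key: hops for key, hops in next_hops.items() if len(hops) > 1}


def violating_router_share(observations):
    """Fraction of observed routers with at least one violation.

    `observations` must be a list (it is traversed twice).
    """
    routers = {router for router, _, _ in observations}
    bad = {router for router, _ in find_violations(observations)}
    return len(bad) / len(routers) if routers else 0.0
```

Mapping each next hop to its AS and repeating the same check on AS numbers would give the stricter case of next hops in different ASes.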

Revisiting Broadband Performance (review)

Igor Canadi and Paul Barford (University of Wisconsin) and Joel Sommers (Colgate University)

Growing interest in broadband subscription and FCC interest in investigating broadband speeds and rates. They use data from Ookla, a Flash-based performance testing application with over 700 server locations. The paper uses 59 metro areas across the world, segmenting the areas based on geographic diversity. The data is compared against SamKnows data. Some ISPs are seen to be rate-limiting users to very low speeds.

Obtaining In-Context Measurements of Cellular Network Performance (review)

Aaron Gember and Aditya Akella (University of Wisconsin-Madison) and Jeffrey Pang, Alexander Varshavsky, and Ramon Caceres (AT&T Labs-Research)

Checking the performance of user devices under different conditions, crowdsourcing with 12 volunteers to measure the performance of cellular networks and using speed test websites to look at latency and loss over different hours of the day. They look at different situations and positions of the phone; however, different data delivery types can affect this result quite heavily.

Cell vs. WiFi: On the Performance of Metro Area Mobile Connections (review)

Joel Sommers (Colgate University) and Paul Barford (University of Wisconsin)

Another mobile performance measurement using crowdsourced speed test data, also collected from native apps on smartphones. iOS devices show more latency compared to Android devices, perhaps due to poor OS or API design. They find the performance of WiFi better, but cellular is more consistent.

Network Performance of Smart Mobile Handhelds in a University Campus WiFi Network (review)

Xian Chen and Ruofan Jin (University of Connecticut), Kyoungwon Suh (Illinois State University), and Bing Wang and Wei Wei (University of Connecticut)

An interesting paper comparing CDN performance between Akamai and Google on campus.

1. Breaking for Commercials: Characterizing Mobile Advertising (review)

Narseo Vallina-Rodriguez and Jay Shah (University of Cambridge), Alessandro Finamore (Politecnico di Torino), Hamed Haddadi (Queen Mary, University of London), Yan Grunenberger and Konstantina Papagiannaki (Telefonica Research), and Jon Crowcroft (University of Cambridge)

BEST paper! read it fully! :)

Screen-Off Traffic Characterization and Optimization in 3G/4G Networks (review) (short paper)

Junxian Huang, Feng Qian, and Z. Morley Mao (University of Michigan) and Subhabrata Sen and Oliver Spatscheck (AT&T Labs-Research)

Collecting data from 20 volunteers on Android for 5 months, sampling screen status at 1 Hz. Screen-off traffic consumes half of the energy on the network interface because applications download less and the traffic pattern changes. Screen-aware fast dormancy increases energy saving by 15%.

Configuring DHCP Leases in the Smartphone Era (review) (short paper)

Ioannis Papapanagiotou (North Carolina State University) and Erich M Nahum and Vasileios Pappas (IBM Research)

Using a big trace to look at DHCP lease duration and lifetime in corporate and academic environments.

Video Telephony for End-consumers: Measurement Study of Google+, iChat, and Skype (review)

Yang Xu, Chenguang Yu, Jingjiang Li, and Yong Liu (Polytechnic Institute of NYU)

This actually won the best paper award; I recommend reading it! They show the effect of video and voice processing on end-to-end delay, and also present the techniques used for scalability.

On Traffic Matrix Completion in the Internet (review)

Gonca Gursun and Mark Crovella (Boston University)

The idea is to reverse engineer traffic matrices to detect invisible flows (those going through other networks), using the AS topology and traffic matrices of ASes with a matrix completion method.

DNS to the rescue: Discerning Content and Services in a Tangled Web (review)

Ignacio Bermudez, Marco Mellia, and Maurizio Munafò (Politecnico di Torino) and Ram Keralapura and Antonio Nucci (Narus Inc.)

Interesting paper about the complex content delivery chain in the internet, and a service which helps classify the type of contents.

Beyond Friendship: Modeling User Activity Graphs on Social Network-Based Gifting Applications (review)
Atif Nazir, Alex Waagen, Vikram S. Vijayaraghavan, Chen-Nee Chuah, and Raissa D’Souza (UC Davis) and Balachander Krishnamurthy (AT&T Labs-Research)

Aiming to model user activity on OSNs, using Facebook apps data. Power-law fits are seen for in-degrees but out-degree has a strong heavy tail; the node activity has to be modelled from connectivity.

Inside Dropbox: Understanding Personal Cloud Storage Services (review)

Idilio Drago (University of Twente), Marco Mellia and Maurizio M. Munafo (Politecnico di Torino), and Anna Sperotto, Ramin Sadre, and Aiko Pras (University of Twente)

Looking at the Dropbox data and file storage system, which splits files into 4 MB chunks and uses encrypted communication. There is a separation between storage and control communication. Dropbox seems to be a very popular app, mainly used via the native client. Experiments using PlanetLab show that generally all clients use the same data centres in the US (Amazon data centre for data, and control in California). The slicing into chunks means that many of the files are too small and do not use the bandwidth efficiently due to TCP slow start, so even for large files filling the channel capacity takes longer.
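
A back-of-the-envelope sketch of the chunking effect follows. It assumes an idealised slow start (congestion window doubling every RTT, no loss, no receive-window limit), an MSS of 1460 bytes and an initial window of 10 segments; all of these parameters and both function names are assumptions for illustration and are not taken from the paper.

```python
import math

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB chunks, as described above


def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return the list of chunk sizes (in bytes) for a file of `file_size` bytes."""
    full, rest = divmod(file_size, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


def slow_start_rounds(nbytes, mss=1460, initial_cwnd=10):
    """RTTs needed to send `nbytes` if the window only ever doubles (idealised)."""
    segments = math.ceil(nbytes / mss)
    cwnd, sent, rounds = initial_cwnd, 0, 0
    while sent < segments:
        sent += cwnd   # send one full window this round trip
        cwnd *= 2      # slow start doubles the window each RTT
        rounds += 1
    return rounds
```

Under these assumptions a single 4 MB chunk needs `slow_start_rounds(CHUNK_SIZE)` round trips, and a small file finishes while the window is still tiny, so the link is never filled. If each chunk transfer were to start from a fresh window, that ramp-up cost would be paid again per chunk.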

Content delivery and the natural evolution of DNS (review)

John S. Otto, Mario A. Sánchez, John P. Rula, and Fabián E. Bustamante (Northwestern University)

Discusses the use of DNS for dynamic routing, and the use of OpenDNS and Google DNS for these purposes. CDNs depend on the user's DNS to direct requests. Different redirections mean better performance; try out namehelp for proactive caching.

Measuring the Deployment of IPv6: Topology, Routing and Performance (review)

Amogh Dhamdhere, Matthew Luckie, Bradley Huffaker, and kc Claffy (CAIDA), Ahmed Elmukashfi (Simula), and Emile Aben (RIPE)

IPv4 addresses have run out. IPv6 has been around but has not been used as it is not backwards compatible, hence tunnelling has been the main growth area. They use measurement data from BGP and AS relationships, along with lots of other data, and classify ASes into transit providers, content/access/hosting providers and enterprise customers. They find that IPv6 is strong at the core but lagging at the edge. They then measure AS-level paths from 7 vantage points towards dual-stacked ASes. They find the IPv4 network maturing, and transit providers deploying IPv6, as are content providers, while the edge is lagging, with Europe and Asia leading.