Report on the ACM EuroSys 2013 Conference

Technical University Library

This year EuroSys, one of the main systems conferences, took place in Prague, Czech Republic. There were 28 accepted papers out of 143 submissions (a 19% acceptance rate), a bit higher than in previous years, though the statistics did not count submissions that failed to fulfil the requirements, which had been counted in past editions. Of the accepted papers only 6 were European, and plenty came from MSR.

Monday 15th April

Session 1: Large scale distributed computation I

TimeStream: Reliable Stream Computation in the Cloud

Zhengping Qian (Microsoft Research Asia), Yong He (South China University of Technology), Chunzhi Su, Zhuojie Wu, and Hongyu Zhu (Shanghai Jiao Tong University), Taizhi Zhang (Peking University), Lidong Zhou (Microsoft Research Asia), Yuan Yu (Microsoft Research Silicon Valley), and Zheng Zhang (Microsoft Research Asia)
Good motivation from MSR for stream processing: real-time heat maps of pairwise latency in a datacenter for network/infrastructure monitoring, and real-time advertising, which maps queries, both current and past, to the adverts presented.

Adverts must be reliable! More so than monitoring (makes sense :) ).

The contribution focuses on resilience and fault tolerance. They build a DAG that is rewritten dynamically, replacing links with hashed ones spanning several nodes. No information is lost: any subgraph can be substituted by an equivalent one, which reloads/recomputes the missing pieces.

Several optimizations are added, such as message batch aggregation and lightweight dependency tracking from inputs to outputs to estimate the impact of a failure.

Interesting work, and it was well presented.

Optimus: A Dynamic Rewriting Framework for Execution Plans of Data-Parallel Computation

Qifa Ke, Michael Isard, and Yuan Yu (Microsoft Research Silicon Valley)

Motivation: many things about large-scale computations cannot be known in advance. How do you handle partition skew, and what is the right number of tasks (e.g. reducers)? Large-scale matrix multiplication, they argue, can also cause problems with intermediate steps, as can iterative computation and providing fault-tolerance capabilities. The paper proposes to optimize the EPG (Execution Plan Graph) at runtime to solve these issues.

The solution for reliability is interesting: keeping a 'cache' of intermediate data and choosing either copy of the data for the next step.

BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data

[BEST PAPER AWARD]

Sameer Agarwal (University of California, Berkeley), Barzan Mozafari (Massachusetts Institute of Technology), Aurojit Panda (University of California, Berkeley), Henry Milner (University of California, Berkeley), Samuel Madden (Massachusetts Institute of Technology), and Ion Stoica (University of California, Berkeley)

Motivation: compute aggregate statistics on huge amounts of data.

Queries can ask to SELECT ... WITHIN 2 SECONDS; results come with error bounds and can be refined. The idea is very elegant and original, and it was brilliantly presented.

The response time is estimated by querying small samples and extrapolating (it should be linear). So, how well does a sample cover the original query? It depends on what elements the sample contains; they compute how complete the data will be (I didn't follow this part too well). Each sample has a cost and a coverage, and an ILP problem determines which samples to build.
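
To make the sampling idea concrete, here is a toy sketch of my own (not BlinkDB code; the data and numbers are invented): estimate an aggregate from a uniform sample and report a normal-approximation error bound.

    import math
    import random
    import statistics

    def approx_avg(rows, sample_fraction=0.01, z=1.96):
        # Estimate AVG(rows) from a uniform sample, with a ~95% error bound
        # from the normal approximation.
        n = max(2, int(len(rows) * sample_fraction))
        sample = random.sample(rows, n)
        mean = statistics.mean(sample)
        err = z * statistics.stdev(sample) / math.sqrt(n)
        return mean, err

    # 1M synthetic "latency" values; the estimate touches only 1% of them.
    data = [random.expovariate(1 / 20.0) for _ in range(1_000_000)]
    est, err = approx_avg(data)
    print(f"AVG ~= {est:.2f} +/- {err:.2f}")

BlinkDB goes well beyond this uniform sample: it maintains stratified samples so rare groups are still covered, and the ILP decides which samples to keep within a storage budget.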

Session 2: Security and Privacy

IFDB: Decentralized Information Flow Control for Databases

David Schultz and Barbara Liskov (MIT CSAIL)

Information flow control, by tagging database rows with labels stored in an extra column. It's far from my area, but I am surprised this has not been done before. The paper is well explained, and the results show that the incurred overhead is small. A problem pointed out in the questions is that it requires manual tagging, which is in general extremely hard to do right.
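
As a minimal illustration of the row-label idea (my own SQLite sketch with invented table and label names, not IFDB's actual mechanism, which does far more):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (body TEXT, label TEXT)")
    conn.executemany("INSERT INTO messages VALUES (?, ?)",
                     [("hello", "public"), ("salary data", "hr_secret")])

    def query_for(labels):
        # Only rows whose label the principal holds are visible; a real DIFC
        # database would also track labels on query outputs.
        marks = ",".join("?" * len(labels))
        return conn.execute(
            "SELECT body FROM messages WHERE label IN (%s)" % marks,
            labels).fetchall()

    print(query_for(["public"]))                # [('hello',)]
    print(query_for(["public", "hr_secret"]))   # both rows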

Process Firewalls: Protecting Processes During Resource Access

 Hayawardh Vijayakumar (The Pennsylvania State University), Joshua Schiffman (Advanced Micro Devices), and Trent Jaeger (The Pennsylvania State University)

There are many threats to file access; resource access control is very hard, and programmers are not going to get it right. System-call-level protection carries a huge overhead. Idea: reverse the threat-protection approach. With introspection you protect vulnerable processes instead of sandboxing dangerous attackers: declare resources unsafe for a specific process context.

It is interesting that processing firewall rules is much more efficient than implementing the checks manually (which is also error-prone). The declarative approach apparently wins by a huge margin.
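
A hypothetical sketch of what such declarative rules could look like (my illustration, not their system; all names are invented):

    import fnmatch

    # Each rule declares a resource pattern unsafe for a process context.
    RULES = [
        ("webserver", "open_upload", "/etc/*"),
        ("webserver", "open_upload", "*/.ssh/*"),
    ]

    def resource_allowed(program, context, path):
        # Deny if any rule matches this (program, context, resource) triple.
        return not any(
            prog == program and ctx == context and fnmatch.fnmatch(path, pat)
            for prog, ctx, pat in RULES)

    print(resource_allowed("webserver", "open_upload", "/etc/passwd"))    # False
    print(resource_allowed("webserver", "open_upload", "/srv/up/a.png"))  # True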

Resolving the conflict between generality and plausibility in verified computation

Srinath Setty, Benjamin Braun, Victor Vu, and Andrew J. Blumberg (UT Austin), Bryan Parno (Microsoft Research Redmond), and Michael Walfish (UT Austin)

They propose a cryptographic technique, Probabilistically Checkable Proofs (PCPs), for checking whether a server indeed performed some computation. It is not yet usable because of the huge computational cost.

Session 3: Replication

ChainReaction: A Causal+ Consistent Datastore based on Chain Replication

Sérgio Almeida, João Leitão, and Luís Rodrigues (INESC-ID, Instituto Superior Técnico, Universidade Técnica de Lisboa)

ChainReaction is a geo-distributed key-value store that implements the existing Causal+ consistency model, improving read performance over existing causal replication-based systems. The work extends FAWN, adding capabilities for a geo-distributed setup. It was well presented, and the design decisions and results are clearly reflected in the paper.

Augustus: Scalable and Robust Storage for Cloud Applications

Ricardo Padilha and Fernando Pedone (University of Lugano, Switzerland)

Byzantine Fault Tolerance (BFT) would be convenient in a cloud environment, but it heavily penalizes latency. The paper proposes single-partition transactions plus multi-partition read-only transactions. The restrictions on applicability look a bit severe, but it was nicely validated across different workloads. I am not sure the social-network workload they generated was representative: they chose an arbitrary 50% chance of a connection being close, with no justification.

MDCC: Multi-Data Center Consistency

Tim Kraska, Gene Pang, and Michael Franklin (UC Berkeley), Samuel Madden (MIT), and Alan Fekete (University of Sydney)

The authors present MDCC, a replication technique that attempts to exploit two main observations about geo-distributed databases: conflicting operations are often commutative, and conflicts are actually rare, as each client mostly updates their own data. With these, they implement a modified combination of Multi-Paxos and Fast Paxos, which attempts to lower latency by reducing the number of phases in several cases. An extensive set of experiments points to a significant performance improvement over other transactional databases.

Session 4: Concurrency and Parallelism

Conversion: Multi-Version Concurrency Control for Main Memory Segments 
Timothy Merrifield and Jakob Eriksson (University of Illinois at Chicago)

Cache coherence has become a main bottleneck in multi-core systems. Proposal: each process handles its own working copy for concurrent memory access. If processes can afford to work with a slightly out-of-date copy, performance can be significantly improved.
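
Roughly, the idea can be pictured like this (my own sketch of multi-versioning, not their kernel implementation):

    import copy

    class VersionedSegment:
        # Toy multi-version memory segment: readers keep a private snapshot.
        def __init__(self, data):
            self.version = 0
            self.data = data

        def checkout(self):
            # A process takes its own (possibly soon stale) working copy.
            return self.version, copy.deepcopy(self.data)

        def commit(self, new_data):
            # A writer publishes a new version; readers upgrade when ready.
            self.version += 1
            self.data = new_data

    seg = VersionedSegment({"counter": 0})
    v, local = seg.checkout()    # reader works on its private copy
    seg.commit({"counter": 1})   # a concurrent writer publishes version 1
    print(v, local, "latest:", seg.version, seg.data)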

Whose Cache Line Is It Anyway? Operating System Support for Live Detection and Repair of False Sharing 
Mihir Nanavati, Mark Spear, Nathan Taylor, Shriram Rajagopalan, Dutch T. Meyer, William Aiello, and Andrew Warfield (University of British Columbia)

Writes to the same cache line from multiple processes force everyone's copy to be written back to main memory, which can have a huge impact on performance in many cases. Idea: split the page into an isolated page containing the conflicting data and an underlay page with no conflicts.

Adaptive Parallelism for Web Search 
Myeongjae Jeon (Rice University), Yuxiong He (Microsoft Research), Sameh Elnikety (Microsoft Research), Alan L. Cox and Scott Rixner (Rice University)

In web search services (e.g. Bing), parallelism involves querying multiple index servers for results and aggregating them with ranking techniques such as PageRank. However, within an index server queries are processed sequentially. The paper discusses the challenges of parallelizing in-server search. I don't think it is especially novel, but it is well explained.

Tuesday 16th April

Session 1: Large scale distributed computation II

Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing

Zuhair Khayyat, Karim Awara, and Amani Alonazi (King Abdullah University of Science and Technology), Hani Jamjoom and Dan Williams (IBM T. J. Watson Research Center, Yorktown Heights), and Panos Kalnis (King Abdullah University of Science and Technology)

Mizan is a Pregel-based system, implemented in C++, which optimizes execution time by dynamically migrating vertices at every iteration. Each node's superstep execution is profiled, so that they can see statistically which nodes perform slower and, if over a threshold, migrate vertices dynamically. Every worker is matched with a worker with less load, to which all its migrations are headed. Vertex locations are tracked through a DHT. The technique is interesting in the sense that it completely ignores graph topology: it might incidentally reduce the number of cut edges, but it does so only by looking at runtime statistics. However, the destination is chosen to load-balance CPU, so if the network is the bottleneck it might not be enough.

The solution works, although the validation has some problems: they only looked at graphs in the 2M-vertex range, which raises the question of why not just use a single machine. Comparisons were against a specific version of Giraph, which Greg Malewicz pointed out had been vastly improved in the last six months.
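
My reading of the migration scheme, as a toy sketch (invented names and threshold, not Mizan code):

    import statistics

    def plan_migrations(runtimes, threshold=1.2):
        # runtimes maps worker id -> superstep execution time; a worker is
        # "slow" if it exceeds threshold * mean. Slow workers are paired with
        # the least-loaded ones, which receive their migrated vertices.
        mean = statistics.mean(runtimes.values())
        slow = sorted((w for w, t in runtimes.items() if t > threshold * mean),
                      key=runtimes.get, reverse=True)
        fast = sorted((w for w in runtimes if w not in slow), key=runtimes.get)
        return list(zip(slow, fast))

    print(plan_migrations({"w1": 10.0, "w2": 4.0, "w3": 5.0, "w4": 22.0}))
    # -> [('w4', 'w2')]: only w4 is over threshold; it pairs with idle w2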

MeT: Workload aware elasticity for NoSQL

Francisco Cruz, Francisco Maia, Miguel Matos, Rui Oliveira, João Paulo, José Pereira, and Ricardo Vilaça (HASLab / INESC TEC and U. Minho)

MeT is an HBase extension that performs elastic configuration of slaves. Depending on the observed access patterns, it provides replication and load balancing. Monitoring runtime statistics feeds a decision algorithm that identifies suboptimal configurations from CPU usage; if a problem is detected, a distribution algorithm is run to spread the replicas.

Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices

Shivaram Venkataraman (UC Berkeley), Erik Bodzsar (University of Chicago), and Indrajit Roy, Alvin AuYoung, and Robert S. Schreiber (HP Labs)

Presto is an R library that parallelizes matrix computations across distributed machines (a darray construct, with foreach support). The API is lightweight and requires minimal modification of R code. When parallelizing operations on sparse matrices, Presto attempts to avoid the impact of skew across matrix partitions.

They use an online repartitioning scheme, profiling each partition to optimize further iterations: if the ratio is higher than a threshold, the problematic partition is split. It is integrated into R without modifying it, by hacking memory allocation, including object headers. The contribution is not major, but it is well thought out and well described.
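
The split rule, as I understood it, in toy form (invented names, not Presto's API):

    def rebalance(partitions, ratio=2.0):
        # partitions: list of (name, profiled_seconds, nonzeros). Split any
        # partition whose profiled time exceeds ratio * the mean time.
        mean = sum(t for _, t, _ in partitions) / len(partitions)
        out = []
        for name, t, nnz in partitions:
            if t > ratio * mean:
                # Halve the skewed partition; both halves are re-profiled
                # on the next iteration.
                out += [(name + ".a", t / 2, nnz // 2),
                        (name + ".b", t / 2, nnz - nnz // 2)]
            else:
                out.append((name, t, nnz))
        return out

    print(rebalance([("p0", 1.0, 100), ("p1", 9.0, 5000), ("p2", 1.5, 120)]))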

Session 2: Operating Systems Implementation

RadixVM: Scalable address spaces for multithreaded applications

Austin T. Clements, Frans Kaashoek, and Nickolai Zeldovich (MIT CSAIL)

Failure-Atomic msync(): A Simple and Efficient Mechanism for Preserving the Integrity of Durable Data

Stan Park (University of Rochester), Terence Kelly (HP Labs), and Kai Shen (University of Rochester)

Composing OS extensions safely and efficiently with Bascule

Andrew Baumann (Microsoft Research), Dongyoon Lee (University of Michigan), Pedro Fonseca (MPI Software Systems), and Jacob R. Lorch, Barry Bond, Reuben Olinsky, and Galen C. Hunt (Microsoft Research)

Session 3: Miscellaneous

Hypnos: Understanding and Treating Sleep Conflicts in Smartphones

Abhilash Jindal, Abhinav Pathak, Y. Charlie Hu, and Samuel Midkiff (Purdue University)

They analyze several sleep conflicts that occur when the smartphone's power-management state machine fails and the device is not effectively suspended. Tested on Nexus One and Galaxy S devices (3+ years old). Looks a bit weak.

Prefetching Mobile Ads: Can advertising systems afford it?

Prashanth Mohan (UC Berkeley) and Suman Nath and Oriana Riva (Microsoft Research)

It is MS data, of course, but… just go read Breaking for Commercials: Characterizing Mobile Advertising.

Maygh: Building a CDN from client web browsers

Liang Zhang, Fangfei Zhou, Alan Mislove, and Ravi Sundaram (Northeastern University)

Maygh is a web-based CDN implemented with HTML5. Content is cached through HTML5 LocalStorage (5 MB, with programmatic control). The novelty: no client modification at all (just the browser, no plugins such as FireCoral). It is implemented with RTMFP (Flash) and WebRTC; the key is NAT traversal via STUN.

The architecture is based on a proxy, the Maygh coordinator, which maintains a directory of content via hashing. The idea works in principle, but it has not been developed beyond a proof of concept (scalability and security are not addressed properly). An interesting read.
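
The coordinator's directory can be pictured as a content-hash index over online clients (my sketch; the real protocol adds RTMFP/WebRTC signalling and NAT traversal):

    import hashlib

    class Coordinator:
        # Toy content directory: content hash -> set of online clients.
        def __init__(self):
            self.directory = {}

        @staticmethod
        def content_id(blob):
            return hashlib.sha1(blob).hexdigest()

        def announce(self, client, blob):
            # A browser reports content it holds in LocalStorage.
            self.directory.setdefault(self.content_id(blob), set()).add(client)

        def lookup(self, cid):
            # Return a peer to fetch from, or None -> fall back to the origin.
            peers = self.directory.get(cid)
            return next(iter(peers)) if peers else None

    c = Coordinator()
    logo = b"...png bytes..."
    c.announce("browser-42", logo)
    print(c.lookup(Coordinator.content_id(logo)))  # browser-42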

Wednesday 17th April

Session 1: Virtualization

hClock: Hierarchical QoS for Packet Scheduling in a Hypervisor

Jean-Pascal Billaud and Ajay Gulati (VMware, Inc.)

RapiLog: Reducing System Complexity Through Verification

Gernot Heiser, Etienne Le Sueur, Adrian Danis, and Aleksander Budzynowski (NICTA and UNSW) and Tudor-Ioan Salomie and Gustavo Alonso (ETH Zurich)

Application Level Ballooning for Efficient Server Consolidation

Tudor-Ioan Salomie, Gustavo Alonso, and Timothy Roscoe (ETH Zurich) and Kevin Elphinstone (UNSW and NICTA)

Currently many applications (e.g. databases) are not designed to work fairly in a virtualized environment, where resources such as memory are shared and dynamically assigned. Instead, they grab hold of the resources, whereas in many cases they could work with much less memory and perform similarly.

The paper proposes a technique for 'ballooning' applications, so that the amount of memory assigned to them can be expanded or squeezed. It requires modifying the applications; it was developed for MySQL and the OpenJDK JVM, running over Xen. A very interesting paper.

Session 2: Scheduling and performance isolation

Omega: flexible, scalable schedulers for large compute clusters

[BEST STUDENT PAPER AWARD]

Malte Schwarzkopf (University of Cambridge Computer Laboratory), Andy Konwinski (University of California Berkeley), and Michael Abd-el-Malek and John Wilkes (Google Inc.)

Omega is the upcoming scheduler for Google datacenters. It performs heterogeneous scheduling, segregating types of jobs, batch and service, with priority for batch (as batch jobs are orders of magnitude more numerous). The solution: multiple schedulers with shared state and optimistic concurrency. They had to add some optimizations because constraints caused an enormous number of conflicts when simulating realistic scenarios. The approach allows custom schedulers per application type, and they show an example that improves MapReduce scheduling by playing with the patterns in user preferences. Very good paper, a well-deserved award.
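
The shared-state idea in toy form (my sketch, not Omega code): each scheduler works from a snapshot of the cell state, and a commit fails if another scheduler got there first.

    class CellState:
        # Toy shared cluster state with optimistic commits.
        def __init__(self, free_cpus):
            self.free_cpus = free_cpus
            self.version = 0

        def snapshot(self):
            return self.version, self.free_cpus

        def try_commit(self, seen_version, cpus_wanted):
            # A commit succeeds only if nobody committed since our snapshot
            # and the resources are still free; otherwise the scheduler
            # retries from a fresh snapshot.
            if seen_version != self.version or cpus_wanted > self.free_cpus:
                return False
            self.free_cpus -= cpus_wanted
            self.version += 1
            return True

    cell = CellState(free_cpus=100)
    v, _ = cell.snapshot()         # the service scheduler takes a snapshot
    cell.try_commit(v, 10)         # ...but a batch scheduler commits first
    print(cell.try_commit(v, 5))   # stale commit is rejected: False

The paper discusses coarse- versus fine-grained conflict detection; the sketch above is the coarse, whole-cell version.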

Choosy: Max-Min Fair Sharing for Datacenter Jobs with Constraints

Ali Ghodsi, Matei Zaharia, Scott Shenker, and Ion Stoica (UC Berkeley)

Same problem: resource allocation in multi-tenant datacenters. They apply a constrained max-min algorithm to guarantee fairness; the algorithm recursively maximizes the allocation of the user with the fewest machines. Nice flow-filling model, with an offline optimum and an online approximation. A very interesting contrast with the previous paper: one very practical, based on the real Google workload, and this one much more academic, on the resource-allocation problem.
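
The online heuristic can be pictured as progressive filling under constraints (my sketch, granting one machine at a time to the eligible user with the smallest allocation):

    def constrained_max_min(machines, eligible):
        # eligible: user -> set of machines their constraints allow.
        # Greedy progressive filling: hand out one machine at a time to the
        # eligible user who currently holds the fewest machines.
        alloc = {user: [] for user in eligible}
        for m in machines:
            candidates = [u for u, ok in eligible.items() if m in ok]
            if candidates:
                poorest = min(candidates, key=lambda u: len(alloc[u]))
                alloc[poorest].append(m)
        return alloc

    print(constrained_max_min(
        ["m1", "m2", "m3", "m4"],
        {"alice": {"m1", "m2", "m3", "m4"}, "bob": {"m1", "m2"}}))
    # -> {'alice': ['m1', 'm3', 'm4'], 'bob': ['m2']}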

CPI2: CPU performance isolation for shared compute clusters

Xiao Zhang, Eric Tune, Robert Hagmann, Rohit Jnagal, Vrigo Gokhale, and John Wilkes (Google, Inc.)

Performance isolation does not work perfectly in practice because of contended resources such as cache memory. They have implemented detection of problematic processes by monitoring hardware performance counters, so that the culprits can be throttled.
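
The detection step can be sketched as simple outlier testing over cycles-per-instruction samples (my illustration, with invented numbers):

    import statistics

    def find_antagonized(cpi_samples, k=2.0):
        # Flag tasks whose cycles-per-instruction is more than k standard
        # deviations above the mean CPI of peer tasks of the same job.
        values = list(cpi_samples.values())
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        return [task for task, cpi in cpi_samples.items()
                if cpi > mean + k * sd]

    # CPI per task of one job; task-7 shares a machine with a noisy neighbour.
    samples = {f"task-{i}": 1.0 + 0.02 * i for i in range(7)}
    samples["task-7"] = 2.4
    print(find_antagonized(samples))  # ['task-7']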

Report from IEEE WCNC 2012

IEEE WCNC is the world's premier wireless event, bringing together industry professionals, academics, and individuals from government agencies and other institutions to exchange information and ideas on the advancement of wireless communications and networking technology. The 2012 edition of the conference was held in Paris, France, on 1-4 April 2012. The paper "Dynamic Frequency Allocation and Network Reconfiguration on Relay Based Cellular Network" by Dr. Haibo Mei, Dr. John Bigham and Dr. Peng Jiang was accepted into the Mobile and Wireless Networks program of the conference. The abstract of the paper is in Appendix 1.

During the conference trip it was exciting to meet many outstanding researchers and developers in the wireless communication field. There were a number of keynote speeches and presentation sessions, and the author received some valuable comments during the presentation. All these experiences will help the authors' further research in the wireless communication field. In particular, the works "On Fractional Frequency Reuse in Imperfect Cellular Grids", "Energy-Efficient Subchannel Allocation Scheme Based on Adaptive Base Station Cooperation in Downlink Cellular Networks", and "Optimized Dual Relay Deployment for LTE-Advanced Cellular Systems" were the most interesting to the author.

This conference trip was valuable. It is always fantastic to have such a chance to exchange ideas with other researchers.

 

Appendix 1:

Abstract: Relay Based Cellular Networks (RBCNs) are a key development in cellular networking technology. However, because of ever increasing demand and base station failure, RBCNs still suffer from user congestion and low resilience. This paper proposes two competing solutions to these problems: dynamic frequency allocation and antenna tilting. First, a new dynamic fractional frequency allocation algorithm and a heuristic antenna tilting algorithm are designed, and the comparative benefits of each algorithm are investigated. Second, the additional benefits of applying the two algorithms sequentially or iteratively are evaluated. The benefits of iteratively integrating the two algorithms are the more interesting, as such integration allows the two algorithms to be applied cooperatively. The evaluations are based on high-demand scenarios and base station failure scenarios. The results show that for the high-demand scenarios the new dynamic fractional frequency allocation algorithm is very powerful, and the advantage of antenna tilting is present but not large. However, for the BS failure case there is a marked additional benefit to antenna tilting. The integrated solution achieves significantly more benefit than simple sequential application of the two algorithms.

Report from EACL 2012

13th Conference of the European Chapter of the Association for Computational Linguistics

Day 1:

Workshop on Semantic Analysis in Social Media (SASN2012)

The first half of the day had some pretty interesting talks: on unsupervised part-of-speech tagging for social media, emotional stability on Twitter, speech act tagging for Twitter, and topic classification for blogs. The second half was a bit less interesting (IMHO) as it focused more on tools/software, but note the one on predicting Dutch election results (full workshop proceedings available online here).

Day 2:

Workshop on Computational Models of Language Acquisition and Loss

Mark Steedman’s keynote on CCG grammar induction from semantics was very interesting – there’s little information in the workshop abstract, but see the related EACL conference paper (full workshop proceedings available here).

Workshop on Unsupervised and Semi-Supervised Learning in NLP

Generally interesting for techniques, but mostly applied to standard text tasks (parsing, coreference resolution, etc., which I find it hard to get excited about). But note the one on child language acquisition (full workshop proceedings available here).

Main Conference:

The keynote speeches were great: Martin Cooke on how to make speech more intelligible without necessarily making it louder; Regina Barzilay on using reinforcement learning to learn language/semantics directly from task success; Ray Mooney on learning language from context (although I missed that one to come back & give revision lectures …)

Some other highlights for me: Heriot-Watt’s demo of their most recent POMDP dialogue system; Potsdam/Bielefeld’s experiments on improving NLU by using incremental pragmatic information; some nice stuff on unsupervised learning of semantic roles; and possibly the worst talk I have ever had the misfortune to sit through (I won’t link to it, but it’s on paraphrase generation via machine translation, if you really want to find it. I’m sure the paper’s excellent).

Oh, and my and Stuart’s paper on Twitter emotion detection, of course.

Full proceedings available here.

Report from the Passive and Active Measurement Conference 2012

http://pam2012.ftw.at/

PAM is the oldest Internet measurement conference, started in 2000. This year’s edition took place in Vienna in March 2012.

The keynote was given by Yuval Shavitt (Tel Aviv University) on “Internet Topology Measurement: Past, Present, Future”. Topology measurements are still an active area of research, as our visibility of the Internet topology is still limited, and subject to significant biases whose impact is still being investigated by the research community.

Session on Malicious Behavior
- Detecting Pedophile Activity in BitTorrent Networks: P2P networks are used for various illegal activities, including pedophile activity.
- Re-Wiring Activity of Malicious Networks: Malicious networks tend to lose their connectivity when their activities are spotted. This study looks at the visibility of network re-wiring.

Session on Traffic Evolution and Analysis
- Unmasking the Growing UDP Traffic in a Campus Network: UDP traffic is becoming more and more popular; in China, for example, it has been reported to be as high as 80% in some networks. This paper provides similar evidence from Korea.
- Investigating IPv6 Traffic—What happened at the World IPv6 Day? Despite efforts such as World IPv6 Day, there is still little IPv6 traffic in the Internet. This paper studies what happened during World IPv6 Day based on two vantage points: a campus network in the US and a large IXP in Germany.
- An End-Host View on Local Traffic at Home and Work: This paper compares local and wide-area traffic from end-hosts connected to different home and work networks.
- Comparison of User Traffic Characteristics on Mobile-Access versus Fixed-Access Networks: Mobile traffic is growing, and we still do not know much about how it differs from traffic seen on wired networks.

Session on Evaluation Methodology
- SyFi: A Systematic Approach for Estimating Stateful Firewall Performance: Firewalls are pervasive in today’s Internet given the need to protect networks from attacks. This paper builds a predictive model of the throughput achieved by commercial firewalls.
- OFLOPS: An Open Framework for OpenFlow Switch Evaluation: Current OpenFlow implementations have different levels of maturity. This work proposes a framework to test OpenFlow implementations capabilities.
- Probe and Pray: Using UPnP for Home Network Measurements: UPnP is nowadays becoming a popular active measurement platform. This paper studies the limitations as well as the usefulness of this platform.

Session on Large Scale Monitoring
- BackStreamDB: A Distributed System for Backbone Traffic Monitoring Providing Arbitrary Measurements in Real-Time. Distributed approaches for traffic monitoring are the future. Nice piece of work.
- A Sequence-oriented Stream Warehouse Paradigm for Network Monitoring Applications. Mining large-scale data is a pain, and networking is no exception. This paper proposes SQL extensions to ease the monitoring of networks by allowing to express sequence-oriented queries in a declarative language.
- PFQ: a Novel Engine for Multi-Gigabit Packet Capturing With Multi-Core Commodity Hardware. Unleashing the power of multi-cores to deliver amazing packet copying to user-space apps!

Session on New Measurement Initiatives
- Difficulties in Modeling SCADA Traffic: A Comparative Analysis: One of the few measurements of M2M traffic. Nothing surprising in terms of traffic properties, given the small scale of these traffic traces.
- Characterizing delays in Norwegian 3G networks: Yet another study of 3G networks. The talk generated heated comments on the limitations of the methodology.
- On 60 GHz Wireless Link Performance in Indoor Environments: Studies the use of 60 GHz wireless indoors and the conditions under which it works (LOS and NLOS). Pretty positive results…
- Geolocating IP Addresses in Cellular Data Networks: Very interesting geolocation paper that confirms previous work on the issues with geolocation databases.

Session on Reassessing Tools and Methods
- Speed Measurements of Residential Internet Access. How do bandwidth probing tools compare when using them to measure the available bandwidth of residential users?
- One-way Traffic Monitoring with iatmon. Using one-way delay measurements to track changes in traffic behavior and classifying different traffic sources.
- A Hands-on Look at Active Probing using the IP Prespecified Timestamp Option. IP options are not very widely used, despite their potential applicability. This work shows more evidence for the discrepancy between RFC and implementations.

Application Protocols
- Xunlei: Peer-Assisted Download Acceleration on a Massive Scale. A must-read: one of the pieces of the future in content delivery platforms!
- Pitfalls in HTTP Traffic Measurements and Analysis. What you should not trust from packet-level data when analyzing HTTP traces.
- A Longitudinal Characterization of Local and Global BitTorrent Workload Dynamics. Nice study of different types of content delivered through BitTorrent (file size, throughput, type of content).

Perspectives on Internet Structure and Services
- Exposing a Nation-Centric View on the German Internet – A Change in Perspective on the AS Level. Trying to define the AS-level ecosystem of Germany; it is still unclear whether this makes any sense, even though many defense agencies would like to be able to define it.
- Behavior of DNS’ Top Talkers, a .com/.net View. First ever analysis of the .com and .net TLD servers. Very interesting observations about IPv6 DNS, as well as the set of DNS resolvers that are the top talkers with the TLD servers. Must-read for DNS.
- The BIZ Top-Level Domain: Ten Years Later. Thinking about what will happen with the .biz domain given the last 10 years of its use, especially defensive registrations. A must-read for DNS.

Report from IEEE INFOCOM 2012

http://www.ieee-infocom.org/program.html

Day 1: Mini-conference
Mini-conference papers are those that were discussed but not accepted into the main conference. They’re borderline papers, though sometimes more thought-provoking than papers accepted at the main conference. The variety of topics among the mini-conference papers is also better than at the main conference.

Day 2: Conference opening, keynote, and afternoon sessions
The conference received about 1500 submissions; fewer than 300 were accepted. A few awards were given during the opening. The keynote was given by Broadcom's CTO. The topic was datacenters, very focused on Broadcom's switch product line. Very few comments were made on management or on new topics such as OpenFlow, despite their interest to the research community.

The main conference was organized as 6 parallel tracks. Ten sessions addressed sensor network design, showing the increased importance of this topic, strongly related to the Internet of Things. A significant fraction of the papers deal with non-wired communications: sensors, wireless and mobile. Hot topics such as datacenters, cloud/grid, social computing, energy efficiency, and software-defined radio are of course getting more attention than they used to. The only topic surprisingly missing is optical communications.

Day 3
Sessions on cloud/grid were interesting, covering many aspects of the issues in the cloud. INFOCOM being a rather applied-theory conference, most of the papers address topics from an optimization, game theory, or performance evaluation viewpoint. The session on network optimization was the most interesting of the day in my opinion, with 3 papers from Google about traffic engineering on the Google network, worth reading.

Day 4
This last day of the conference was very interesting, with sessions on Internet measurement, Future Internet architectures, and Internet routing and router design. Multiple very interesting papers, such as:
- A Hybrid IP Lookup Architecture with Fast Updates: this paper proposes to speed up IP lookups by using both TCAM and SRAM/FPGA, ensuring that updates have limited disruptive impact on lookups.
- Transparent acceleration of software packet forwarding using netmap: bypassing the TCP/IP stack through simplified drivers that allow applications to speak directly with the NICs.

Report from ACM EuroSys MPM2012 workshop

Measurement, Privacy, and Mobility (MPM 2012) 

http://www.cambridgeplus.net/MPM12/program.html

Keynote from Steve Uhlig on content delivery platforms, agile network measurement, and understanding the CDN ecosystem: adaptation to changes in demand is slow today, so it would be better to use virtualisation technologies to manage demand shifts. There is growing infrastructure and storage diversity, which allows for universal content delivery, so virtualisation can enable mobility and agile services.
DSwiss, SecureSafe:
Attackers have a variety of methods for accessing data, and password-only solutions are not enough (it is possible to scan the whole IPv4 address space in a day). Trust in cloud providers is based on social prestige. They use Secure Remote Password (SRP) to avoid MITM attacks on passwords even over insecure channels. In addition, a number of key chains and symmetric and asymmetric keys are used to enable document sharing; however, if the user forgets their password AND their recovery code, the data is deleted. They encourage providers to prevent employee access to data.

David Evans, malfunction analysis and privacy attacks:
Sensors in buildings have privacy implications since they are not protected. Classifying data using tags enables reflecting the physical environment and reasoning about the privacy implications of sensing in the physical world. Tags can be based on sensor, location and time. This allows analysing the sensitivity of data in different contexts and using different data sources in conjunction with one another.

Miguel Nunez, Markov-based mobile location prediction:
Predicting trajectories is important for services such as content delivery, tourist information, weather reports, etc. This has been done using raw trajectories or by clustering trajectories using semantic mapping. Using Markov models allows probabilistic prediction of the sequence of states, using dense joint clusters. They used the Microsoft GeoLife dataset and their own dataset to train and test the n-MMC, which gives them about 70-80% accuracy, especially for higher numbers of user POIs.
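
For intuition, a first-order version of the idea (my sketch; the n-MMC conditions on the n previous locations rather than just the current one):

    from collections import Counter, defaultdict

    def train(trajectories):
        # Count transitions between (clustered) locations.
        counts = defaultdict(Counter)
        for traj in trajectories:
            for here, nxt in zip(traj, traj[1:]):
                counts[here][nxt] += 1
        return counts

    def predict(counts, here):
        # Most likely next location given the current one.
        nxt = counts.get(here)
        return nxt.most_common(1)[0][0] if nxt else None

    model = train([["home", "cafe", "office"],
                   ["home", "cafe", "gym"],
                   ["home", "cafe", "office"]])
    print(predict(model, "cafe"))  # office
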
ANOSIP: Anonymizing the SIP Protocol
Iraklis Leontiadis (Institute Eurecom)
SIP is often used for phone conferencing, with text-based call-flow messages. The aim of the work is to protect the identity of the user from the call portals or from man-in-the-middle attacks, using a number of techniques.

Online Privacy: From Users to Markets to Deployment
Dr Vijay Erramilli (Telefónica I+D Research, Spain)
The economic model of the web: free services in exchange for personal data, so advertising is the main economic driver. They want to understand the monetization aspect (check their paper on arXiv). They carried out a questionnaire via a browser plugin, asking users about the value of their actions; highly revisited data and sites yield high gains. They are pursuing economics and marketing approaches to understand the ecosystem further.

Confidential Carbon Commuting
Chris Elsmore, Anil Madhavapeddy, Ian Leslie, and Amir
Understanding employee commutes is important; however, the data is hard to collect. The university used an app to collect commute data, with a personal container used for data aggregation. It allows sensitive questions to be asked about employee habits. Check lockerproject.org.

The Impact of Trace and Adversary Models on Location Privacy Provided by K-anonymity
Volkan Cambazoglu and Christian Rohner (Uppsala University)

They used trace generation over different walk models to simulate locations, with k-anonymity for identity protection and obfuscation to hide event times.

An Empirical Study on IMDb and its Communities Based on the Network of Co-Reviewers, Maryam Fatemi and Laurissa Tokarchuk
Interaction between people and content on social networks is important. There are a number of recommendation systems available, but they suffer from shortcomings. A number of methods are used to compare movie-review communities on IMDb, showing that genres and context must be taken into account.

Providing Secure and Accountable Privacy to Roaming 802.11 Mobile Devices, Panagiotis Georgopoulos, Ben McCarthy, and Christopher Edwards
Mobile devices require connectivity and security. Differences in protocols and access point configurations affect user mobility. An eduroam equivalent can work, using the CUI of RFC 4372: the request is anonymous to the access network, which relays an alias to the home network for authentication. A real IPv6 deployment test was done in Lancaster.

When Browsing Leaves Footprints – Automatically Detect Privacy Violations
Hans Hofinger, Alexander Kiening, and Peter Schoo (Fraunhofer Research Institution for Applied and Integrated Security AISEC)
They introduced Prividor, a privacy-violation detector implemented as a browser add-on. There are a large number of web techniques for tracking users, such as cookies and scripts. A database is used to keep track of bad sites, in addition to code checking. A centralised version was chosen for better management.

 

EPSRC COMNET Workshop report, February 9-10, QMUL

Report for EPSRC Workshop on Social Networks and Communications

http://www.commnet.ac.uk/node/42

9-10 February, Queen Mary University of London

Organisers: Hamed Haddadi, Laurissa Tokarchuk, Mirco Musolesi, Tristan Henderson

The COMNET workshop aimed to bring together leading researchers and academics working within Digital Economy and Networking research in the UK. Over the two days, more than 50 people from academic and industrial institutions attended. The workshop was very interactive, with a small number of engaging talks, several "proposal writing" and "challenge solving" sessions, and a large amount of informal introductions and project bootstrapping. The number of emails, messages and interactions on social networks afterwards was indicative of the success of the workshop, whose program can be found at http://www.commnet.ac.uk/node/42 .

The keynote talk was delivered by Professor Yvonne Rogers (UCL), an expert in Human-Computer Interaction (HCI). She highlighted the need to design equipment and websites that are also suitable for elderly, disabled or less educated members of society, and demonstrated a range of simple products and ideas enabling shoppers to understand the healthiness of their products. The talk was followed by an individual introduction from every participant, where research interests and industrial relevance were discussed.

The afternoon session of the first day featured talks from Dr Abhijit Sengupta (Unilever) and Dr Stuart Battersby (Chatterbox Analytics), who both discussed the new uses of digital and social media for advertising and brand marketing. They highlighted the strong need for collaboration between graph theorists, complex-network researchers and NLP experts in order to understand the large volumes of data. This is also in line with the EPSRC Big Data research focus area.

Cecilia Mascolo (Uni. of Cambridge) delivered the Friday morning talk on different aspects of research on social networks and the challenges that remain to be solved. The talk was followed immediately by the second break-out group exercise, which aimed to solve some of the challenges in making social networks more secure for users and more useful for different organisations, be they friendship recommendation websites or crowd-control scenarios.

 

Professor Derek McAuley (Horizon Digital Economy) concluded the workshop with an overview of the discussions, challenges and ideas brought up in the workshop, discussing potential avenues for research into the digital economy, some of which are listed below.

Overall, the participants discussed a number of ethical, security and scalability issues around digital-economy-themed projects such as green networking, human-computer interaction, online social networks and personal data.

The researchers highlighted a number of strategic areas where more
cooperation and collaboration between academia, industry and
governments is required:
i) Scaling up social science and scaling down complex systems research: currently there is a big gap between social network researchers, focusing on long-term monitoring and study of a very small number of subjects, and complex systems researchers, trying to crunch data about millions of users without focusing on individual interactions. This gap needs to be narrowed.
ii) Clear ethics: researchers should take more responsibility for the collection, storage and sharing of publicly available data, especially since aggregation of such data can ease correlations and inferences. We can also drive this forward via innovative systems.
iii) Formation of incentives: experiments should aim to provide the right incentives and to include a diverse range of participants.
iv) Think globally: the law always lags behind technology, and technology is usually designed in a single country and deployed everywhere, so researchers must take into account ethical, cultural and moral implications.
v) Digital inclusion: more work is needed on HCI and easier-access technology to include the older generation in Internet services.
As one researcher put it, "At some points I even felt like suggesting to the others that we should write a grant proposal about our ideas", and another: "a very interesting and insightful workshop. I enjoyed the discussions immensely." And a personal blog post:

http://www.syslog.cl.cam.ac.uk/2012/02/11/we-are-all-social-scientists-now-but-werent-we-born-that-way-anyway/

We acknowledge the EPSRC COMNET programme for funding this exciting workshop, and we hope to be able to organize further such events regularly.