#+TITLE: Internship on SimGrid with Martin Quinson and Anne-Cécile Orgerie.
#+DATE: [2016-05-17 Tue]--[2016-07-15 Fri]
#+AUTHOR: Simon Bihel
#+EMAIL: [[mailto:simon.bihel@ens-rennes.fr]]
#+WEBSITE: [[simonbihel.me]]
#+LINK: [[https://github.com/sbihel/internship_simgrid]]
#+LANGUAGE: en

* Introduction
  Distributed systems have grown to be immensely complex. To help study and
  improve these systems, simulators have been developed. Besides being easy to
  access, they allow testing new pieces of software in the exact same
  environment each time. Extracting results from experiments is also easier, as
  a simulator can have perfect knowledge of all simulated components. The
  subject of study can be energy consumption, computing time, bandwidth usage...
  There have been many simulators, but few have kept up with the evolution of
  distributed systems and can simulate grids as well as clouds, for example.
  [[LMC03][SimGrid]] is one of them, and the one I worked on during my
  internship. My work focused on the simulation of clouds with elastic/dynamic
  tasks. This kind of task can model website requests, such as rendering
  markdown on a Wikipedia page, where the task is triggered/run/activated for
  each request. The resources needed fluctuate with usage and, unlike a normal
  task, there is no single computation duration, as requests are spread over
  time rather than made all at once. I proposed a way to formulate these tasks
  for the users of SimGrid and implemented it. Experiments have shown the
  validity of the contribution.

* Findings
** Bibliography
*** Writing
   *Background*
     + What's a cloud
     + Minimising the cost (in all forms) is a research challenge, particularly
       for fluctuating usage
     + Different types of scaling
     + Present the survey that has categorized these works
     + Simulation isn't used that much and it would give a lot
     + How are they categorized
     + Say that it gives good overview of what is needed to simulate
     + Say that it covered everything

     Cloud computing is a model that makes available infrastructures, platforms
     and software with a pay-as-you-go subscription. It aims to reduce the cost
     with a layer of virtualisation that allows virtual resources to be
     dynamically adjusted and occupied on-demand. The problem of using the
     minimal resources for the current demand/usage is still a research
     challenge that spans all layers and applications. This dynamic management
     of clouds is called cloud elasticity.

     [[NGS15]] has categorized works on cloud elasticity and shows which
     elements of a cloud infrastructure, platform or application/software are
     impacted. For now, most research works are evaluated on real clouds.
     It is interesting for a distributed systems simulator to investigate what
     is needed for simulating cloud elasticity. If it is shown that research
     works on cloud elasticity can be evaluated on a simulator, they would
     benefit from cost reduction, re-runnable experiments, trust in results...

     In this survey proposals are categorized as follows. The scope is about
     what elements of a cloud the proposals work on. It can be the management of
     VMs, allocation of resources... Then there is the purpose of the proposal:
     enhancing the /performance/ (to meet the SLA), reducing the /energy/
     consumption/footprint, being /available/ when needed and reducing the
     overall /cost/. Another dimension is the decision making. This is what a
     proposal adds to an existing cloud to pursue its purpose. In addition to
     the scope there are the elastic actions performed by the proposals. While
     the scope is about what elements of a cloud are concerned, the elastic
     action is about what is done to them. Then there is the provider dimension,
     which tells whether there is one provider or several. Finally there is the
     method used by the proposal to evaluate itself: real cloud, simulation or
     emulation.

     The survey gives a good overview on what elements of a cloud are
     manipulated to achieve cloud elasticity.

     But in the end..
     // How can I be sure that it has covered all cases?

     As the proposals are about reacting to varying usage, simulators need a way
     to express this fluctuating workload. We worked on elastic tasks that model
     tasks that are triggered regularly and with a usage that fluctuates over
     time.

   *State of the art* (Needs for elastic tasks and concurrent tools)
     + Which elements have to be simulated and how do they work?
     + How workloads are generally modeled
     + What other simulators have done

     Based on the classification of the survey, a simulator should allow the
     manipulation of scopes, the evaluation of the different purposes, make
     possible the elastic actions and allow multiple providers.

     // Tell that simgrid does all last 3 ?

     At the moment no simulator article talks about dynamic workload. On the
     other hand, in the code of DCsim there was an interactive task and in the
     code of CloudSim there was a host with dynamic workload.

     // Go to contribution and go through each scope ?

     For en-actor scopes, the point of elastic tasks is just to generate usage,
     so they can choose an application type depending on what kind of usage they
     want (cpu, disk...). For the different application types, elastic tasks
     use the same mechanism; only the inherent micro-task that is repeated
     changes. An elastic task will repeatedly execute an MSG task.
     Currently an MSG task can only simulate computing and message passing, so
     only Multi-tier Applications can be simulated at the moment. Simulating
     disk and RAM usage would allow the simulation of databases, storage and
     thus generic applications.

   *Contribution*
     What can my contribution do that is in the survey?
     + Only computational tasks for now, might be able to do storage/DBs tasks
       in the future. Generic ones won't be possible unless you can pass a task
       directly to an ET.
     + Need to use the host of a task when executing to allow vertical scaling
       and need to manage multiple hosts to allow horizontal scaling.
     + Multiple providers are possible but have to be coded.
     + Purpose?

*** References
+ Clouds
  - <<NGS15>>[[http://link.springer.com/chapter/10.1007/978-3-319-29919-8_12][Cloud Elasticity Survey]]. Survey on research work on cloud
    elasticity. Good overview of all research done on cloud elasticity. It
    hints at what people might want in SimGrid. Tons of references to papers
    that give a better understanding of how to formulate workloads and other
    stuff. Highlight: "Finally, more research on benchmarks is needed to better
    assess the quality of each of the proposals.".
  - <<ASPLOS12>>[[http://www.cs.rutgers.edu/~ricardob/papers/asplos12.pdf][DejaVu]]. Framework that enhances and accelerates resource
    allocation with e.g. caching. Used real traces for evaluation. Explains how
    to deal with dynamic workload. For their Hotmail traces they reference [[http://research.microsoft.com/pubs/144957/euro040-thereska.pdf][this]]
    article, which acknowledges some people for it at the end.
  - <<GPRTB14>>[[http://ac.els-cdn.com/S0167739X1400003X/1-s2.0-S0167739X1400003X-main.pdf?_tid=4acfd48e-3871-11e6-afe5-00000aab0f6b&acdnat=1466597171_52db5c840097473a97294f899053a67b][Coordinating Managers]]. Uses RUBiS for experiments.
+ Simulation
  - <<CGLQS14>>[[https://hal.inria.fr/hal-01017319/PDF/simgrid3-journal.pdf][SimGrid]].
  - <<SOCC10>>[[http://research.microsoft.com/pubs/143358/socc10-spikes.pdf][Modeling workload spikes]]. Proposal for generating
    significant/realistic workload spikes. "In the rest of the paper, workload
    volume represents the total workload rate during a five-minute interval."
    What differentiates them from some related work is that they are interested
    in a minute scale. They take a normal workload and multiply it to get
    spikes. Based on their generator they would only use triggerOnce for ET.
    They use Zipf's law.
+ Concurrent tools
  - <<CRBRB10>>[[http://www.buyya.com/papers/CloudSim2010.pdf][CloudSim]], [[https://github.com/Cloudslab/cloudsim][repo]]. It's a simulator of clouds. Quite famous but
    nothing on elastic tasks (HostDynamicWorkload in the code). Good background
    section, specially built for clouds. No elastic task and is apparently
    missing VM related stuff (see [[TKBL12]]).
  - <<TKBL12>>DCsim's [[https://github.com/digs-uwo/dcsim][repo]], [[http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6380046][paper1]], [[http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6727859][paper2]] and [[https://www.dmtf.org/sites/default/files/svm2012_presentation1.pdf][slides]]. Simulator for data
    centres to evaluate resource management. Potential users of SimGrid among
    its users, InteractiveTasks in the code.
  - Searched who cited DCsim. [[http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6380049][One]] paper was about comparing algorithms, [[http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6572981][another]]
    about switching strategies at runtime. They both seem to give details even
    if the code isn't available. Well I have no idea how this could be useful as
    they are describing experiments that have nothing to do with elastic tasks.
  - <<RUBiS>>[[http://rubis.ow2.org/][RUBiS]]. Benchmarking auction website.
  - <<YCSB>>[[http://delivery.acm.org/10.1145/1810000/1807152/p143-cooper.pdf?ip=131.254.104.45&id=1807152&acc=ACTIVE%20SERVICE&key=7EBF6E77E86B478F%2E9BD6B3DBCD4B0A3B%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&CFID=620961178&CFTOKEN=20477141&__acm__=1466600316_208bf65c16eed45e57cd254a778a1ecb][YCSB]]. Benchmarking Cloud Serving Systems.
+ Not relevant
  - [[http://ac.els-cdn.com/S1569190X1300124X/1-s2.0-S1569190X1300124X-main.pdf?_tid=0ede5a0c-2351-11e6-826f-00000aacb362&acdnat=1464274353_4043525da0d2e6c2cb9432f0a6955443][DCworms' paper]]. Simulation to study the energy-consumption of datacenters,
    part of CoolEmAll project. What's interesting for me is that it uses
    workflows to model workloads. Broad range of tools. But I think it's
    focusing on a model that allows better analysis of energy consumption.
    Globally it is very focused on having control on everything to get a precise
    evaluation of the energy consumption.
  - <<JNSI13>>[[http://download.springer.com/static/pdf/46/chp%253A10.1007%252F978-3-642-31552-7_39.pdf?originUrl=http%3A%2F%2Flink.springer.com%2Fchapter%2F10.1007%2F978-3-642-31552-7_39&token2=exp=1463995249~acl=%2Fstatic%2Fpdf%2F46%2Fchp%25253A10.1007%25252F978-3-642-31552-7_39.pdf%3ForiginUrl%3Dhttp%253A%252F%252Flink.springer.com%252Fchapter%252F10.1007%252F978-3-642-31552-7_39*~hmac=81aa15290d88a2cbd2017547f69672bbe5f6ce338b05eba1489ca37d2cfb1fa2][ISim]]. Took a look because it was speaking of dynamic workload.
    But it is a meta-scheduler and it performs workload consolidation for power
    management. In the end I think it has nothing to do with what I am looking for.
+ Misc.
  - <<AS14>>[[http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6779436&tag=1][Survey]].
  - <<PGWK15>>[[http://ieeexplore.ieee.org/ielx7/7092813/7092808/07092927.pdf?tp=&arnumber=7092927&isnumber=7092808][Complement to simulations]].
+ Not categorized yet / Not read yet
  - <<FPK14>>[[http://ieeexplore.ieee.org/ielx7/6902666/6903436/06903474.pdf?tp=&arnumber=6903474&isnumber=6903436][Autoscaling]]. Autoscaling on heterogeneous resources and multiple
    levels of QoS requirements. It uses wikibench for the evaluation and runs it
    on real infrastructures...
  - <<HW13>>[[http://faculty.cs.gwu.edu/~timwood/papers/icac13_final.pdf][Memory caching]]. Adaptive distributed (autoscaling, evenly
    distributed load) memory caching. It uses wikibench for the evaluation but
    runs it on real infrastructures...
  - <<MVD12>>[[http://ieeexplore.ieee.org/ielx5/6297612/6298144/06298161.pdf?tp=&arnumber=6298161&isnumber=6298144][Profit-Maximizing Resource Allocation]]. Again doing experiments for
    real.
  - <<MDCSIM>>[[http://www.cse.psu.edu/~bus145/MDCSIM.pdf][MDCSim]]. Simulation platform for in-depth analysis of multi-tier
    data centers.
** Contribution
- Proposals.
  1. Real traces.
  2. Tasks like in DCsim with visit ratio (like how many times the task's
     triggered/launched).
  3. Generator function.
- Scenario: You have a website. Each time a page is loaded you have a task that
  is triggered. In real life you have one VM exclusively for this task and
  overall the amount of work depends on the activity of visitors over time. Thus
  you want to express a task that has fluctuating computing requirements and
  that lasts over time (there is no fixed amount of computation to execute
  immediately, using all available resources, and to kill when it's done).
- Criteria of quality for proposals.
  + Complexity for the user: describing elastic tasks must feel at least as
    familiar as describing normal tasks.
  + Size on disc/in memory: real traces take a lot of space so the description
    of fluctuations for an elastic task must be lighter.
  + Computing speed: elastic tasks should be precise enough to avoid wrong
    simulations but without taking much longer than current performance.
  + Expressiveness: expressing elastic tasks should be natural and close to
    setting up real dynamic cloud tasks.
  + Implementable in SimGrid: avoiding massive refactoring and using current
    code would be appreciated.
- e = new ElasticTask(comp_size);
  e.setTriggerRatioVariation(vector<date, ratio>);
  OR e.setTriggerTrace(FILE*);
  e2 = new ElasticTask(comp_size);
  e.addOutputStream(e2);
- Cases that the contribution should cover:
  + Horizontal scaling (number of VMs is modified).
  + Vertical scaling (dynamically configuring the CPU and the RAM and Disk
    size). /Should we deduce from that that DB tasks doesn't impact other stuff
    ?/
  + (Application) Live migration where only specific DBs are migrated instead of
    full VMs.
  + Application reconfiguration (i.e. application architectural change).
- Develop on S4U
- See maxmin code to find out why it's difficult to write a callback for VMs
- Alice and Eve processes in S4U
  2 .hpp for deployment and execution
  doc S4U 3.14
  Eve's a user that's going to verify that the contribution's working
  See energy.cpp as an example of plugin

* Development

* Global Goals
** TODO Internship subject <2016-05-30 Mon>
** TODO Bibliography <2016-05-17 Tue>--<2016-05-27 Fri>
** TODO Contribution <2016-05-30 Mon>--<2016-06-17 Fri>
** TODO App + study <2016-06-20 Mon>--<2016-06-27 Mon>
** TODO Experiments <2016-06-28 Tue>--<2016-07-05 Tue>
** TODO Report writing <2016-07-06 Wed>--<2016-07-13 Wed>
** TODO Report 1.0 <2016-07-15 Fri>

* Journal
** Week 1 <2016-05-17 Tue>--<2016-05-20 Fri>
*** Things Done
- Read Introduction, Background and Architecture parts of the CloudSim's paper
  [[CRBRB10]]. Gave better understanding of cloud's layers and the difficulties
  added to grids.
- Opened the [[http://www.buyya.com/papers/gridsim.pdf][GridSim paper]], looked at some figures and closed it upon
  encountering pages of uml class diagram and code samples.
- Meet-up with Anne-Cécile and Martin. Better understanding of my role (how to
  express elastic tasks) and the context (other simulators, the point of this
  work, ...).
- Tweaked/Fixed vim/tmux/orgmode config stuff, [[https://github.com/sbihel/dotfiles][my dotfiles]].
- Looked at resources on DCsim [[TKBL12]]. Said in 2012 that CloudSim is missing VM
  replication, VM dependencies, work conserving cpu... Talks about reallocating
  resources to VM (not wasting cpu's unused shares/resources) and managing
  resources following fluctuating usage in general, but not elastic tasks. In
  the few examples, there is one about StaticPeak as a SimulationTask but all
  examples look the same, I must have missed something.
*** Blocking Points
- +Can't connect on irc through Inria's network ??+ Currently using a ssh
  tunnel.
- "lua5.2 found when lua5.3 is required" for -Denable_lua. Library for 5.3 not
  installed. /on OS X/
- libdw not found for -Denable_model-checking. /on OS X/
- +Should I focus on VM deployment (allocation, provisioning) or VM usage
  (management) ? ("les charges")+ VM usage. -> User is using the simulator to
  test its allocator of VMs.
*** Planned Work
- [X] Install SimGrid from source
- [X] Autoconnect #simgrid on irc.oftc.net
- [X] Read tutorial [[http://simgrid.gforge.inria.fr/documentation.php]]
- [X] Go through tutorial [[http://simgrid.gforge.inria.fr/simgrid/3.13/doc/tutorial.html]]
- [X] See concurrent tools like DCsim and GridSim. Pay attention to VM charges.

** Week 2 <2016-05-23 Mon>--<2016-05-27 Fri>
*** Things Done
- DCsim's code. There is InteractiveTasks which might correspond to elastic
  tasks. It consists of default and max number of instances, resource size,
  normal service time, and visit ratio. I guess if the ratio changes over time
  the task becomes elastic.
- CloudSim's code. There is HostDynamicWorkload which might correspond to
  elastic tasks. List of processing elements... Meh, looks like it's just for
  keeping up to date with performance degradation of the VM.
- Took a look at [[JNSI13][ISim's paper]] because it was speaking of dynamic workload. But
  it is a meta-scheduler and it performs workload consolidation for power
  management. In the end I think it has nothing to do with what I am looking for.
- Contribution proposal 1. Elastic task is like a server's requests log. The
  parts that aren't over 100% of usage are reduced to one task. And we deal with
  the other parts. Cons: a long non-excessive part translated into one task can
  lose a lot of information (a lot of usage over a short time can have an effect
  on bandwidth usage for example?); if there are a lot of peaks over the limit
  then there is a lot to deal with if it goes down between each peak. Maybe
  maths could help having a smarter decomposition.
- Contribution proposal 2. Like in DCsim a task is triggered/visited regularly
  and to simulate the elasticity the visit ratio has to be changed. Pros: the
  precision of the simulation depends on the precision of ratio changes given by
  the user, thus performance depends on the user (avoiding responsibilities
  ¯\_(ツ)_/¯); convenient for the user.
- Contribution proposal 3~. If we consider that elastic tasks never really end,
  we could play with the resources of the VMs on which it is executed and the
  task would use it fully. I guess that would be a way of doing proposal 2.
  Cons: playing with resources means not simulating the real world and may
  falsify the results because resource management has a huge impact on other
  stuff.
- Contribution proposal 4~. Generating function or history {date; value}*.
- Read [[http://ac.els-cdn.com/S1569190X1300124X/1-s2.0-S1569190X1300124X-main.pdf?_tid=0ede5a0c-2351-11e6-826f-00000aacb362&acdnat=1464274353_4043525da0d2e6c2cb9432f0a6955443][DCworms' paper]]. Simulation to study the energy-consumption of
  datacenters. Part of CoolEmAll project. Broad range of tools. What's
  interesting for me is that it uses workflows to model workloads. But I think
  it's focusing on a model that allows better analysis of energy consumption.
  Globally it is very focused on having control on everything to get a precise
  evaluation of the energy consumption.
- Explored wikibench.eu. Master thesis for large scale benchmark. Real traces
  from wikipedia with tools to reduce the intensity for example whilst keeping
  interesting properties. People like Guillaume Pierre are using it to evaluate
  autoscaling. More generally all work on cloud and application management can
  be evaluated with it.
- Wrote some sort of scenario file for proposal 1 and 2. Needs more work to have
  correct C code. There is no task duration because I don't feel it's natural
  for a dynamic task to have a predetermined duration. I guess the user will
  have to kill it or reduce the visit ratio to 0. Still need some work to have
  a satisfying description of the visit ratio fluctuations for proposal 2. And
  the base example chosen (cloud-two-tasks) might not be the best because the
  two tasks aren't concurrent and have to be killed before starting another
  one.
- Criteria of quality for proposals.
  + Complexity for the user: describing elastic tasks must feel at least as
    familiar as describing normal tasks.
  + Size on disc/in memory: real traces take a lot of space so the description
    of fluctuations for an elastic task must be lighter.
  + Computing speed: elastic tasks should be precise enough to avoid wrong
    simulations but without taking much longer than current performance.
  + Expressiveness: expressing elastic tasks should be natural and close to
    setting up real dynamic cloud tasks.
  + Implementable in SimGrid: avoiding massive refactoring and using current
    code would be appreciated.
- Searched who cited DCsim. [[http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6380049][One]] paper was about comparing algorithms, [[http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6572981][another]]
  about switching strategies at runtime. They both seem to give details even if
  the code isn't available. Well I have no idea how this could be useful as they
  are describing experiments that have nothing to do with elastic tasks.
- While trying to write an introduction I think I wrote some sort of abstract.
  Well I guess I'll just have to fill-in to get a proper introduction.
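
A small sketch of what proposals 2 and 4 imply for the simulator: given a
history {date; value}* of visit ratio changes, the expected number of visits
over a time window is the piecewise-constant ratio integrated over that window.
The struct and function names are hypothetical, and the history is assumed to
be sorted by date with a ratio of 0 before its first entry.

#+BEGIN_SRC cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One entry of the {date; value}* history: from `date` on, the task is
// visited `ratio` times per second. Names are illustrative only.
struct RatioChange {
  double date;
  double ratio;
};

// Expected number of visits between `begin` and `end`, assuming the history
// is sorted by date and the ratio is 0 before its first entry.
double expectedVisits(const std::vector<RatioChange>& history, double begin, double end) {
  assert(begin <= end);
  double visits = 0.0;
  for (std::size_t i = 0; i < history.size(); ++i) {
    // history[i].ratio applies on the segment [from, to).
    double from = history[i].date;
    double to = (i + 1 < history.size()) ? history[i + 1].date : end;
    // Clip the segment to the requested window.
    if (from < begin) from = begin;
    if (to > end) to = end;
    if (to > from) visits += history[i].ratio * (to - from);
  }
  return visits;
}
#+END_SRC

With a history of two entries, {0; 2} and {10; 5}, the window from 5 to 12
covers 5 seconds at ratio 2 and 2 seconds at ratio 5, so 20 expected visits.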
*** Blocking Points
- [[https://books.google.fr/books?id=io6aBQAAQBAJ&pg=PA92&lpg=PA92&dq=cloud+simulation+dynamic+workload&source=bl&ots=HkoqPCSnzM&sig=Ko-BHh-jMjx_6IDhE67RnTHW3h4&hl=en&sa=X&ved=0ahUKEwih0d65lPDMAhVrB8AKHW0EBVwQ6AEIMjAC#v=onepage&q=cloud%20simulation%20dynamic%20workload&f=false][This paper]] says that [[http://www.ijsr.net/archive/v2i8/MTIwMTMxMjA=.pdf][this paper]] presents an approach at modeling dynamic
  workloads in CloudSim but I didn't understand why.
- Can't seem to find stuff about dynamic tasks/workload, only stuff like dynamic
  resource allocation.
- Haven't really found what injection is in NS-3.
- People have dealt without elastic tasks just fine. Is it really useful ? Can't
  find stuff about it so I guess it's hard to find potential users and their
  needs.
*** Planned Work
- [X] Find other simulators. (e.g. survey cloud simulators).
- [X] See concurrent tools like DCsim and GridSim. Pay attention to varying
      workload. Read doc and source. When reading articles, summarize it.
- [ ] Connect to iwifi-interne.
- [ ] Write introduction.
- [X] Explain why DCworms isn't that useful.
- [X] Discover [[http://www.wikibench.eu/]]. What is it ? Who's using it ?
- [X] Write a formal scenario file that uses the proposals.
- [X] Find criteria to quantify the quality of the proposals. (e.g. complexity
      for the user; size on disc/in memory; computing speed; expressiveness;
      implementable in SimGrid)
- [X] Bibliography, which paper use DCsim, CloudSim, SimWare...
      Bibliography, find some papers of (potential) users that describe their
      setup.
- [ ] See workload injection (injecteurs de charge) in NS-3. Should be similar
      to what we're trying to do.
- [ ] Think about application workflows and interactions between interdependent
      (micro)(elastic)tasks.

** Week 3 <2016-05-30 Mon>--<2016-06-03 Fri>
*** Things Done
- Copied papers description in bibliography section.
- Took a look at [[FPK14]] and it does its evaluation on real infrastructures
  with wikibench. Lame? Same for [[HW13]] and [[MVD12]].
- Partly read [[NGS15]] and [[ASPLOS12]]. As DejaVu clusters workloads into
  classes, the proposal 2 (visit ratio) might be more convenient to study its
  reaction/adaptation (I'm assuming that the clustering doesn't have problems).
*** Blocking Points
- Still have a hard time figuring out what potential users would prefer for the
  API.
- Can a task know by itself when to update its visit ratio ?
*** Planned Work
- [X] More detailed entries for papers read. Abstract (1 sentence, objectives),
      link with my work, pros (what I'd like to reuse and what's worrying), cons
      (what I should say in my article). For the papers' names use the first
      letters of the writers' names or the name of the conference.
- [X] Put the papers descriptions in the bibliography section (write it like a
      related work section).
- [X] Write a scenario file (needs description). Put it in the contribution
      section.
- [X] Search for potential users through wikibench citations.
- [ ] See load injectors of NS-3 because it's similar to what we're trying to
      do.
- [ ] See papers "multi-tiers applications" in [[NGS15][this]].
- [X] Organize bibliography with categories.
- [ ] Propose clearer formulation of the elastic tasks API.

** Week 4 <2016-06-06 Mon>--<2016-06-10 Fri>
*** Things Done
- Worked on writing ElasticTask.hpp with the declaration of the class
  ElasticTask and an example of its use.
- [[https://github.com/sbihel/simgrid-1][Forked SimGrid.]] Started integrating Elastictask in s4u but that might change
  later to become a plugin.
- Examples of internship reports (the best from last year at ENS Rennes):
  [[http://perso.eleves.ens-rennes.fr/people/Timothee.Haudebourg/public/work/ecofen.pdf]],
  [[http://perso.eleves.ens-rennes.fr/people/Alexandre.Debant/work/rapport_stage_l3.pdf]],
  [[http://perso.eleves.ens-rennes.fr/people/Dominique.Barbe/derivationAI_long.pdf]],
  [[http://perso.eleves.ens-rennes.fr/people/Raphael.Berthon/docs/Berthon_Internship_2015.pdf]].
- What work is left to do compared to others? A friendly approach to the
  problem. A more developed analysis of the state of the art. More meaningful
  purpose of the work.
*** Blocking Points
*** Planned Work
- [X] .hpp of elastic task (API proposition).
- [X] Read the survey in detail to avoid missing uses/POVs of clouds.
- [ ] Develop the idea of resizing VMs for another POV of clouds (where you
      seek to lower the extra cost of what you make available to users)
- [X] Compared to good interns' reports, say what's left to do.

** Week 5 <2016-06-13 Mon>--<2016-06-17 Fri>
*** Things Done
- Filled the holes in the code.
- Worked on background and state of the art.
- Meeting notes
  State of the art is about models used
  Don't write sentences, use itemize
  The contribution is a model
  See the article modeling workload spikes because they do what we want
  Use set/getData(), attach data to actor (data examples: )
  ElasticTask should be called ElasticTaskManager
  MSG_task can't be created once and executed multiple times -> give what's
  needed to create the tasks
- Meeting notes
  The ETM is global and ET changes the data of the ETM and when it wakes up it
  looks at what it has to do.
- Meeting notes
  Wake up using semaphore
  timeandwait
  execute(flops) for each micro task
  no tasks just nextEventQueue
  when a microtask is executed and you add another
  execute_init() execute_start()
- Meeting notes
  write what I understood of the modeling spikes paper, look at which
  probability law they use
  use classes instead of structs
  which parts of the API answer to applications of the survey
  think of examples
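
A sketch of the "give what's needed to create the tasks" idea from the meeting
notes: since a task object can only be executed once, the manager keeps a
description and builds a fresh task on every trigger. =TaskDescription= and
=MicroTask= are hypothetical stand-ins written in plain C++; a real version
would create an actual MSG task where =MicroTask= is constructed.

#+BEGIN_SRC cpp
#include <cassert>
#include <string>
#include <utility>

// Everything needed to build one micro-task. Hypothetical fields; a real
// version would hold whatever the task creation call needs.
struct TaskDescription {
  std::string name;
  double flops;
};

// Stand-in for a single-use task object.
class MicroTask {
public:
  MicroTask(std::string name, double flops) : name_(std::move(name)), flops_(flops) {}

  void execute() {
    assert(!executed_);  // a task object is only ever run once
    executed_ = true;
  }

  const std::string& name() const { return name_; }
  double flops() const { return flops_; }
  bool executed() const { return executed_; }

private:
  std::string name_;
  double flops_;
  bool executed_ = false;
};

// Keeps the description, not a task, and builds a fresh task per trigger.
class ElasticTaskManager {
public:
  explicit ElasticTaskManager(TaskDescription desc) : desc_(std::move(desc)) {}

  MicroTask trigger() {
    MicroTask task(desc_.name + "#" + std::to_string(count_), desc_.flops);
    ++count_;
    task.execute();
    return task;
  }

  int triggered() const { return count_; }

private:
  TaskDescription desc_;
  int count_ = 0;
};
#+END_SRC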
*** Blocking Points
*** Planned Work
- [X] <2016-06-13 Mon 17:00> Compared to good interns' reports, say what's left
      to do.
- [X] Setup your own project ; don't touch pimpl_ just use regular msg tasks
- [X] <2016-06-15 Wed 09:00> Write background and state of the art using the
      survey. (Explain what information there is in it, how the studies are
      classified, the good ideas, its limits...)
- [X] Read the paper on modeling workload spikes.
- [X] Work on the code
- [ ] Which part of the survey is covered by the API, which might in the future
      and which won't.

** Week 6 <2016-06-20 Mon>--<2016-06-24 Fri>
*** Things Done
- If we try to simulate the workload generator of [[SOCC10]]. Normally each
  client thread executes requests in a loop. Each thread selects a request
  type, selects parameters, sends the request, waits for a response and
  repeats. If we had to translate it we'd need to create a task that triggers
  the ET once when it is finished. As requests have parameters I guess we
  would need one ET for each request and set of parameters, then clients
  trigger one of them. We don't use the repeating triggering (ratio stuff)
  here.
- If we try to simulate DejaVu [[ASPLOS12]]. "Both traces contain measurements
  at 1-hour increments during one week, aggregated over thousands of servers.".
  For each kind of request there would be an ET and we would put a constant
  triggering over 1 hour and change the ratio each hour.
- Attended "Journées scientifiques". "Vérifier et corriger les logiciels",
  "Modélisation pour la biologie et la médecine" and "Vers une informatique
  ouverte et reproductible".
- A lot of papers use RUBiS. It's an auction website benchmark. Three kinds of
  user sessions: visitor, buyer and seller. We could have just 3 ETs with maybe
  complex microtasks as users can see bids and bid themselves.
- A lot of papers use YCSB. "Each workload represents a particular mix of
  read/write operations, data sizes, request distributions, and so on, and can
  be used to evaluate systems at one particular point in the performance space."
  Four different kinds of ET and it chooses one at random each time, so if we
  compute proactively the number of times each operation will be chosen we
  could use the repeating characteristic of ETs. One thing, as there are
  multiple records to read/write we would have more than 4 ETs. We would need
  more than computing tasks as reading records can vary depending on the writes.
- Meeting notes
  Ask if there is a detach for microtasks,
  Still a while(1) and use a semaphor_acquire
  talk with Gabriel
- Probably don't need ETM to be an Actor
- Meeting notes
  Use futures to do the microtasks
  Still a msg_task in the end, future is controlling the execution
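
A sketch of the DejaVu and YCSB translations discussed above: per-hour request
counts become one constant trigger ratio per hour, and a request mix is handled
by giving each kind of request its own ET with a share of the total. The
function name, the =share= parameter and the (date, ratio) pair layout are
assumptions made for illustration.

#+BEGIN_SRC cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (date in seconds, triggers per second from that date on)
using RatioEntry = std::pair<double, double>;

// Turns per-hour request counts into one ratio entry per hour for the ET of
// one request kind. `share` is the fraction of all requests of that kind
// (e.g. 0.5 if half of the operations are reads); it is an assumed input.
std::vector<RatioEntry> hourlyCountsToRatios(const std::vector<double>& requestsPerHour,
                                             double share) {
  assert(share >= 0.0 && share <= 1.0);
  const double secondsPerHour = 3600.0;
  std::vector<RatioEntry> ratios;
  ratios.reserve(requestsPerHour.size());
  for (std::size_t hour = 0; hour < requestsPerHour.size(); ++hour) {
    double date = static_cast<double>(hour) * secondsPerHour;
    double ratio = requestsPerHour[hour] * share / secondsPerHour;
    ratios.emplace_back(date, ratio);
  }
  return ratios;
}
#+END_SRC

For a one-week trace at 1-hour increments this yields 168 entries per ET,
whatever the number of requests in the trace.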
*** Blocking Points
- Segfault when calling etm->run();
*** Planned Work
- [ ] Write examples.
- [ ] Write correct execute code for ETM.

** Week 7 <2016-06-27 Mon>--<2016-07-01 Fri>
*** Things Done
- Meeting notes
  Examples and see the time to do it and the load which is equivalent to a paper
  (run multiple experiments by increasing the rate/number of ET and see the
  overall time).
  One figure that shows the number of microtasks over the time (the little boxes
  with the start and end...).
- Experiments planning
  DejaVu. "Two servers. Intel SR1560 Series rack servers with Intel Xeon X5472
  processors (eight cores at 3 GHz), 8 GB of DRAM, and 6 MB of L2 cache per
  every two cores. EC2 cluster of 20 virtual machines. To demonstrate DejaVu’s
  ability to scale out, we vary the number of active instances from 2 to 10 as
  the workload intensity changes, but resort only to EC2’s large instance type.
  In contrast, we demonstrate its ability to scale up by varying the instance
  type from large to extra-large, while keeping the number of active instances
  constant." Multi-tier apps, serving static and dynamic content, DB
  interactions... Evaluate time for increasing ETs rate.
  + Platform similar to DejaVu.
- Meeting notes
  2Gflops for one host seems pretty standard
  multiple hosts, one cluster
  try to imitate the papers that use simulation
  One experiment to show performance
  Show with experiments which (interesting) studies it allows
  One slide : what they wanted to do, how do we do it in simgrid
  4 subsections in experimentation : performance, functionalities (one for
  each paper)
  see cfg=tracing, should be automatic
  for performance do n hosts and n ET (one for each)
  5 pages of what people do, why they need to evaluate using my work
- 5 papers using simulation: OMNeT++, home-made python discrete event simulator
  that models a service deployed in the cloud (WC98 traces), ??,
  SPECjEnterprise2010 (WC98 traces), SPECjEnterprise2010 (WC98 traces)
  5 papers using simulation: generic??, vertical scaling, vertical scaling,
  generic??, vertical scaling??
- Meeting notes
  ET should be able to take a text file of timestamps
  from WC98 do one ET with an average flops and use ^
  keep the file opened and add tasks over time (like add a new parameter like
  repeating)
  translate WC98 to a timestamp (one timestamp per line)
  use XBT::translateinteger
  3 types of experiments: functionalities, traces, perf
  add deadline with outputFunction
  add probabilistic law
- Meeting notes
  use deployment file
  put scripts in begin{example} in reporting.org, so that it can be executed
  with C-c C-c
  look at [[https://github.com/taisbellini/aiyra/blob/master/LabBook.org]]
- Experiments of the weekend
  + 0.80s user time to execute the 1000 requests test log file of WC98, with one
    ET and 10 hosts (18+ seconds from real traces) and ~14000 MB max memory
    Tried the day20 of WC98 with ~2 million requests with 2000 hosts but after a
    few hours it bricked my macbook and it restarted.
  + 2nd experiment for raw perfs,
- Lost a day figuring out my queue has the biggest element on top instead of the
  lowest FeelsGoodMan
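The queue pitfall noted above is easy to reproduce: std::priority_queue is a
max-heap by default, so top() returns the largest element. A minimal sketch,
assuming the queue holds trigger dates as plain doubles (the element type
actually used in the ETM is not shown in these notes; DateQueue and pop_all are
hypothetical names), orders it with std::greater so that the earliest date
comes out first:

#+BEGIN_SRC cpp
#include <functional>
#include <queue>
#include <vector>

// Min-heap of dates: std::greater flips the default max-heap ordering,
// so top() is the smallest (earliest) date.
using DateQueue = std::priority_queue<double, std::vector<double>, std::greater<double>>;

// Hypothetical helper: drain a copy of the queue and return the dates in pop order.
std::vector<double> pop_all(DateQueue queue) {
  std::vector<double> order;
  while (!queue.empty()) {
    order.push_back(queue.top());
    queue.pop();
  }
  return order;
}
#+END_SRC

Pushing 3.0, 1.0 and 2.0 into a DateQueue and calling pop_all returns them in
increasing order, which is the order a discrete event loop needs.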
#+BEGIN_SRC cpp
#include <xbt/sysdep.h>
#include "simgrid/s4u.h"
#include "ElasticTask.hpp"
#include "simgrid/msg.h"

XBT_LOG_NEW_DEFAULT_CATEGORY(s4u_test, "a sample log category");

void eve(std::shared_ptr<simgrid::s4u::ElasticTaskManager> etm, double loadIncrease) {
  XBT_INFO("Starting");
  // e1 and e2 each run on a single host.
  simgrid::s4u::ElasticTask *e1 = new simgrid::s4u::ElasticTask(simgrid::s4u::Host::by_name("cb1-2"), 5.0, 0.0,
      etm.get());
  simgrid::s4u::ElasticTask *e2 = new simgrid::s4u::ElasticTask(simgrid::s4u::Host::by_name("cb1-3"), 5.0, 0.0,
      etm.get());
  // Chain the tasks: the output function of e1 triggers e2 once.
  e1->setOutputFunction([e2]() {
      e2->triggerOneTime(1.5);
  });
  // e3 starts on cb1-4 and is then spread over cb1-5 to cb1-19.
  simgrid::s4u::ElasticTask *e3 = new simgrid::s4u::ElasticTask(simgrid::s4u::Host::by_name("cb1-4"), 5.0, 0.0,
      etm.get());
  for(int i = 5; i < 20; i++) {
    e3->addHost(simgrid::s4u::Host::by_name("cb1-" + std::to_string(i)));
  }
  // Triggers for e3 come from a timestamps file (one timestamp per line).
  e3->setTimestampsFile("d81_timestamp_wc.txt");
  // Sleep long enough for the whole trace to play out, then stop the manager.
  simgrid::s4u::this_actor::sleep(99999999);
  etm->kill();
  XBT_INFO("Done.");
}

int main(int argc, char **argv) {
  simgrid::s4u::Engine *e = new simgrid::s4u::Engine(&argc, argv);
  std::shared_ptr<simgrid::s4u::ElasticTaskManager> etm = std::make_shared<simgrid::s4u::ElasticTaskManager>();
  e->loadPlatform("dejavu_platform.xml");
  simgrid::s4u::Actor("ETM", simgrid::s4u::Host::by_name("cb1-1"), [etm] { etm->run(); });
  simgrid::s4u::Actor("main", simgrid::s4u::Host::by_name("cb1-1"), [etm] { eve(etm, 1.0); });
  e->run();
  return 0;
}
#+END_SRC
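The example above feeds e3 from a timestamps file with one timestamp per line,
as decided in the meeting notes. How setTimestampsFile reads the file is not
shown here; as a rough sketch under that one-per-line assumption, a reader
could look like the following (read_timestamps is a hypothetical name, and
skipping blank or malformed lines is my own choice):

#+BEGIN_SRC cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse one timestamp (in seconds) per line.
// Lines that are blank or do not start with a number are skipped.
std::vector<double> read_timestamps(std::istream& in) {
  std::vector<double> dates;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    double date;
    if (fields >> date)
      dates.push_back(date);
  }
  return dates;
}
#+END_SRC

Taking a std::istream keeps the helper usable with a std::ifstream opened on a
file such as d81_timestamp_wc.txt, or with a std::istringstream in a test.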
#+BEGIN_SRC cpp
#include <memory>
#include <string>
#include <vector>
#include <xbt/sysdep.h>
#include "simgrid/s4u.h"
#include "ElasticTask.hpp"
#include "simgrid/msg.h"

XBT_LOG_NEW_DEFAULT_CATEGORY(s4u_test, "a sample log category");

void eve(std::shared_ptr<simgrid::s4u::ElasticTaskManager> etm, int n) {
  XBT_INFO("Starting");
  // One elastic task per host (cb1-1 to cb1-n). A std::vector replaces the
  // variable-length array, which is not standard C++.
  std::vector<simgrid::s4u::ElasticTask*> ets(n);
  for(int i = 0; i < n; i++) {
    ets[i] = new simgrid::s4u::ElasticTask(simgrid::s4u::Host::by_name("cb1-" + std::to_string(i+1)), 5.0, 1.0,
                                           etm.get());
  }
  simgrid::s4u::this_actor::sleep(100);
  etm->kill();
  XBT_INFO("Done.");
}

int main(int argc, char **argv) {
  int argcE = 1;
  simgrid::s4u::Engine *e = new simgrid::s4u::Engine(&argcE, argv);
  std::shared_ptr<simgrid::s4u::ElasticTaskManager> etm = std::make_shared<simgrid::s4u::ElasticTaskManager>();
  e->loadPlatform("dejavu_platform.xml");
  simgrid::s4u::Actor("ETM", simgrid::s4u::Host::by_name("cb1-1"), [etm] { etm->run(); });
  simgrid::s4u::Actor("main", simgrid::s4u::Host::by_name("cb1-1"), [etm, argv] { eve(etm, std::stoi(argv[1])); });
  e->run();
  return 0;
}
#+END_SRC
#+BEGIN_SRC xml
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid/simgrid.dtd">
<platform version="4">
  <config id="General">
    <prop id="network/coordinates" value="yes"/>
  </config>

  <AS id="AS0" routing="Full">
    <cluster id="server1_cb1" prefix="cb1-" suffix="" radical="1-2000" speed="5000Gf" core="8" bw="125MBps"
             lat="100us"/>
  </AS>
</platform>
#+END_SRC
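For the "add probabilistic law" item of the meeting notes, one common choice is
a Poisson process, where the gaps between triggers follow an exponential
distribution with the desired mean rate. The notes do not name a law, so this
choice is my assumption, and poisson_dates is a hypothetical name. A sketch
with the standard <random> facilities:

#+BEGIN_SRC cpp
#include <random>
#include <vector>

// Hypothetical helper: trigger dates of a Poisson process with `rate`
// triggers per second on [0, horizon). Gaps are exponentially distributed.
std::vector<double> poisson_dates(double rate, double horizon, unsigned seed) {
  std::vector<double> dates;
  if (rate <= 0.0)
    return dates;
  std::mt19937 gen(seed);
  std::exponential_distribution<double> gap(rate);
  for (double t = gap(gen); t < horizon; t += gap(gen))
    dates.push_back(t);
  return dates;
}
#+END_SRC

Fixing the seed keeps a run reproducible on a given standard library, which
matters when the same scenario is replayed on different platforms.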
*** Blocking Points
*** Planned Work
- [ ] Write what I'm planning to do with the experiments, what I wanna show...
- [ ] Write what I'm planning to say in my final report.

** Week 8 <2016-07-04 Mon>--<2016-07-08 Fri>
*** Things Done
- I upgraded the platform twice and it's visible in the memory usage
*** Blocking Points
*** Planned Work

** Week 9 <2016-07-11 Mon>--<2016-07-13 Wed>
*** Things Done
- Meeting notes
  - Use LNCS latex template
  + Look back at tables to show what papers need and what I can do
  - Paper notes: NP -> theory isn't enough as there are too many NP problems
  - Only me as author and put advisors in acknowledgements
  - Before clouds people ran their own stuff, now with pay as you go for some
    parts
  - When saying that clouds are complex, say that so many problems are
    NP-complete so theory isn't enough
  + using simulations that are simple, reproducible, simplistic
  - at the end of the introduction put an itemize to say what the main
    contributions are (tell simgrid): modeling of the workload, implementation
    of an API, evaluation
  + the contribution is about unraveling concepts about cloud workload
  + main contribution of the survey: categorizing works
  + the problem I solve is answering a need
  - Put state of the art at the end and name it Related Works
  + Say what a typical simulation is: resources, algorithms used, what they are
    evaluating, scenario
  + talk about simgrid in background
  + on start of contribution tell what is the characterization of a workload
    (what is a task/gridlet/cloudlet (a constant computing load) which doesn't
    match a fluctuating reactive cloud workload)
    we're proposing a discrete representation of a continuous event (one argument
    is that a simulator is discrete, it's easier)
  + link the models to elements of the contribution (an ET represents a flux)
  + instead of floprate we used taskrates to be generic (make sure to define
    atomic/classic tasks)
  + from this, elastic actions types come naturally... list all actions possible
    (and using tables makes it easier) one paragraph for each
  - Change "Nb of ET" to "number of elastic tasks"
  - Say the total number of microtasks
  - Do another experiment with only one ET and an increasing rate
  - Add conclusion to experiments
  - Use table for real traces instead of tons of numbers and use same plan for
    other experiments
  + Use table for C to show which elastic actions are possible and which papers
    use them
- Meeting notes
  + to detect a threshold we would need to set an alarm and be killed by it if
    we don't kill it first (that would be online; offline is easy, we just
    compute the time used) (we can talk about the online way in the conclusion
    as we didn't do it) (the callback is defined by the user)
  + say in introduction why I did this internship, say in conclusion what I
    learned
  + conclusion: conclusion of what was done, what has still to be done, what I
    learned (what was a simulator, what was a research code, life in research
    environment)
- Experiment with only one ET and growing number of triggers per second
#+BEGIN_SRC cpp
#include <xbt/sysdep.h>
#include "simgrid/s4u.h"
#include "ElasticTask.hpp"
#include "simgrid/msg.h"

XBT_LOG_NEW_DEFAULT_CATEGORY(s4u_test, "a sample log category");

void eve(std::shared_ptr<simgrid::s4u::ElasticTaskManager> etm, int n) {
  XBT_INFO("Starting");
  simgrid::s4u::ElasticTask *e3 = new simgrid::s4u::ElasticTask(simgrid::s4u::Host::by_name("cb1-2"), 1.0, n,
      etm.get());
  for(int i = 3; i < 200; i++) {
    e3->addHost(simgrid::s4u::Host::by_name("cb1-" + std::to_string(i)));
  }
  simgrid::s4u::this_actor::sleep(100);
  etm->kill();
  XBT_INFO("Done.");
}

int main(int argc, char **argv) {
  int argcE = 1;
  simgrid::s4u::Engine *e = new simgrid::s4u::Engine(&argcE, argv);
  std::shared_ptr<simgrid::s4u::ElasticTaskManager> etm = std::make_shared<simgrid::s4u::ElasticTaskManager>();
  e->loadPlatform("dejavu_platform.xml");
  simgrid::s4u::Actor::createActor("ETM", simgrid::s4u::Host::by_name("cb1-1"), [etm] { etm->run(); });
  simgrid::s4u::Actor::createActor("main", simgrid::s4u::Host::by_name("cb1-1"), [etm, argv] { eve(etm, std::stoi(argv[1])); });
  e->run();
  return 0;
}
#+END_SRC
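The offline threshold detection mentioned in the meeting notes ("we just
compute the time used") can be sketched as a pass over the finished
microtasks. The record layout is an assumption of mine, since these notes do
not show how microtask dates are stored; MicrotaskRecord and count_missed are
hypothetical names.

#+BEGIN_SRC cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one finished microtask (dates in simulated seconds).
struct MicrotaskRecord {
  double start;
  double end;
};

// Offline check: count the microtasks whose duration exceeded `deadline`.
std::size_t count_missed(const std::vector<MicrotaskRecord>& records, double deadline) {
  std::size_t missed = 0;
  for (const MicrotaskRecord& r : records)
    if (r.end - r.start > deadline)
      ++missed;
  return missed;
}
#+END_SRC

The online variant would need the alarm mechanism discussed in the notes, which
this sketch does not attempt.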
- Meeting notes
  make the figures on the slides bigger
  First small sentence followed by a really long one -> not good
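The "discrete representation of a continuous event" from the paper notes of
this week can be illustrated by turning a constant task rate into individual
trigger dates. This is only a sketch of the idea, not the ETM code;
trigger_dates is a hypothetical name and a constant rate over the whole
interval is assumed.

#+BEGIN_SRC cpp
#include <vector>

// Hypothetical helper: discretize a constant rate (triggers per second)
// into individual trigger dates on [start, start + duration).
std::vector<double> trigger_dates(double rate, double start, double duration) {
  std::vector<double> dates;
  if (rate <= 0.0 || duration <= 0.0)
    return dates;
  // Computing each date from its index avoids accumulating rounding error.
  for (long i = 0; i / rate < duration; ++i)
    dates.push_back(start + i / rate);
  return dates;
}
#+END_SRC

With a rate of 2 triggers per second over 3 seconds starting at 0, this gives
six dates spaced 0.5 seconds apart, the first one at 0.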
*** Blocking Points
*** Planned Work
- [ ] Give back keys and pass and ethernet adapter.

* Conclusion