This repository has been archived by the owner on Jul 16, 2020. It is now read-only.

Weekly Meeting 2016 09 01

Kristen Carlson Accardi edited this page Sep 8, 2016 · 3 revisions

Agenda

## Minutes

#ciao-project Meeting

Meeting started by kristenc at 16:08:19 UTC. The full logs are available at ciao-project/2016/ciao-project.2016-09-01-16.08.log.html .

Meeting summary

Meeting ended at 17:01:57 UTC.

Action Items

  • tcpepper to capture ciao logging issues in a github issue

Action Items, by person

  • tcpepper
    • tcpepper to capture ciao logging issues in a github issue
  • UNASSIGNED
    • (none)

People Present (lines said)

  • kristenc (87)
  • tcpepper (71)
  • markusry (47)
  • albertom (12)
  • jvillalo_mobl (7)
  • mcastelino (7)
  • mrkz (2)
  • ciaomtgbot (2)
  • sameo (2)
  • wdouglas (1)
  • Patifa (1)
  • leoswaldo (1)
  • obedmr (1)

Generated by MeetBot 0.1.4 (http://wiki.debian.org/MeetBot)

### Full IRC Log

16:08:19 <kristenc> #startmeeting
16:08:19 <ciaomtgbot> Meeting started Thu Sep  1 16:08:19 2016 UTC.  The chair is kristenc. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:08:19 <ciaomtgbot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:08:41 <kristenc> roll call.
16:08:44 <kristenc> o/
16:08:52 <mrkz> o/
16:08:58 <sameo> o/
16:08:59 <leoswaldo> o/
16:09:03 <tcpepper> o/
16:09:19 <obedmr> o/
16:09:30 <kristenc> #topic opens
16:09:35 <kristenc> anyone?
16:09:48 <markusry> o/
16:09:48 <jvillalo_mobl> o/
16:10:29 <kristenc> ok - if there are no opens, lets move on.
16:10:43 <kristenc> #topic bugs
16:11:30 <kristenc> there are no P1 bugs.
16:11:58 <kristenc> #link https://github.com/01org/ciao/issues?q=is%3Aopen+is%3Aissue+label%3Abug+label%3AP2
16:12:07 <kristenc> P2s - is anyone working on P2s?
16:12:29 <tcpepper> I think wdouglas: is on 504
16:13:50 <kristenc> I guess the other question is this - we have a few new ones, shall we check that the priority is correct?
16:13:58 <markusry> Sounds good.
16:14:03 <tcpepper> sure
16:14:13 <kristenc> #link https://github.com/01org/ciao/issues/510
16:14:15 <mcastelino> o/
16:14:35 <kristenc> #link https://github.com/01org/ciao/issues/497
16:14:56 <kristenc> #link https://github.com/01org/ciao/issues/500
16:15:18 <kristenc> https://github.com/01org/ciao/issues/504
16:15:23 <kristenc> #link https://github.com/01org/ciao/issues/504
16:15:30 <kristenc> Those are the 4 new P2s.
16:15:49 <Patifa> o/ <--running late
16:15:54 <tcpepper> refresh...there's one more 511
16:15:57 <markusry> Should 510 be a P1?
16:15:58 <tcpepper> https://github.com/01org/ciao/issues/511
16:16:36 <kristenc> 511 has no priority yet, so we'll need to assign that.
16:16:53 <kristenc> markusry, I don't know - what do you think? we've never had this functionality yet.
16:16:54 <tcpepper> can we go through the list in numerical order and hit each?
16:16:54 <markusry> Related to 511, I was wondering whether osprepare sets the proxy for docker to use
16:17:02 <kristenc> tcpepper, sure.
16:17:08 <kristenc> let's start with 497.
16:17:41 <tcpepper> this feels higher than P2 to me
16:17:57 <kristenc> hey, I was just about to vote for downgrading to a P3. :)
16:18:01 <tcpepper> hahaha
16:18:12 <kristenc> so this problem is weird.
16:18:16 <kristenc> it happens sometimes, not others.
16:18:29 <tcpepper> my feeling is that these oddities only get harder to debug as time goes by
16:18:34 <kristenc> the result is that sometimes you get the "name" and "description" field left out when you list the volumes.
16:18:39 <kristenc> to me this isn't a big deal.
16:19:03 <kristenc> as the key thing (the ID) is always correct.
16:19:07 <tcpepper> my sense is it means we have either a race or mem mgmt bug, both of which are dangerous to system stability and/or security
16:19:53 <kristenc> it only happens when you have a mix of using some volumes with name and description set, some without.
16:20:23 <kristenc> i reviewed the code - couldn't see any reason in ciao-cli for this to happen.
16:20:28 <tcpepper> can it be resolved trivially by requiring a name?
16:20:35 <kristenc> this is when I started being suspicious of gophercloud.
16:20:48 <kristenc> requiring a name is annoying to me.
16:21:01 <kristenc> but yes - that would fix the problem I think.
16:21:02 <jvillalo_mobl> I think the description isn't that important, but the name I think should be as I believe it's required by the block api
16:21:07 <jvillalo_mobl> **checks block api
16:21:36 <tcpepper> i wouldn't want to require a description, but having at least a name doesn't seem like too harsh a requirement
16:21:42 <kristenc> the name isn't required required.
16:21:45 <jvillalo_mobl> no, name is an optional field when creating a volume
16:21:58 <tcpepper> ok then we can't require it
16:22:05 <jvillalo_mobl> however, having at least a name is better for user experience
16:22:06 <kristenc> we can make ciao-cli provide one.
16:22:12 <kristenc> if one isn't provided.
16:22:25 <tcpepper> "--no name--" ?
16:22:39 <kristenc> "ciaovolume"
16:22:47 <tcpepper> that works for me
16:22:56 <jvillalo_mobl> ciaovolume$(number)
16:22:57 <kristenc> well - I personally am not sure it's worth it.
16:23:03 <tcpepper> if we do that, I'm ok with dropping to P4, but we should keep the bug open
16:23:28 <jvillalo_mobl> I'll vote not to have many volumes with the same name, adding an identifier at the end would be nice to keep them different
16:23:59 <kristenc> I guess it's up to whoever owns this bug to decide if they want to do this workaround or root cause.
16:24:14 <kristenc> sameo: do you want this bug, or do you want to assign to someone else.
16:25:01 <sameo> kristenc: I can take it.
16:26:30 <kristenc> sameo: ok - you have to use a storage cluster to reproduce I think.
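
The workaround floated above for issue 497 (have ciao-cli supply a name when the user gives none, with an identifier appended so unnamed volumes stay distinguishable) could look roughly like the sketch below. `defaultVolumeName` and the `ciaovolume-` plus random-hex format are assumptions made for illustration; nothing here is taken from the ciao-cli source, and the log does not say which approach was eventually chosen.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// defaultVolumeName is a hypothetical helper, not part of ciao-cli.
// It returns name unchanged when the user supplied one; otherwise it
// returns "ciaovolume-" followed by 8 random hex characters.
func defaultVolumeName(name string) (string, error) {
	if name != "" {
		return name, nil
	}
	suffix := make([]byte, 4)
	if _, err := rand.Read(suffix); err != nil {
		return "", err
	}
	return "ciaovolume-" + hex.EncodeToString(suffix), nil
}

func main() {
	for _, requested := range []string{"scratch", ""} {
		name, err := defaultVolumeName(requested)
		if err != nil {
			fmt.Println("could not generate a name:", err)
			return
		}
		fmt.Printf("requested %q, using %q\n", requested, name)
	}
}
```

This only hides the symptom; as tcpepper notes in the discussion, the underlying cause of the missing fields would still need to be found.
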
16:26:39 <kristenc> next bug.
16:26:52 <kristenc> 500.
16:27:40 <tcpepper> feature or bug?
16:28:08 <tcpepper> ie: do we want it cancellable?
16:28:33 <markusry> Bug
16:28:35 <markusry> For me
16:28:37 <kristenc> I could see that messing up your cluster state.
16:28:40 <kristenc> I would call it a bug.
16:29:00 <markusry> Until this osprepare was integrated I could hit Ctrl-C at any point in launchers run cycl
16:29:01 <markusry> e
16:29:08 <markusry> and it would quit cleanly
16:29:14 <markusry> I can no longer do this
16:29:18 <kristenc> I was actually wondering if it was a P1.
16:29:23 <kristenc> since this seems bad.
16:29:32 <tcpepper> markusry: is osprepare doing something on the system?
16:29:49 <tcpepper> it should synchronously install some stuff or not and then launcher should be as normal
16:29:58 <tcpepper> maybe it's leaked a goroutine
16:29:59 <markusry> Calling apt-get
16:30:10 <tcpepper> strange
16:30:11 <markusry> No it's just that it blocks until everything is installed
16:30:17 <tcpepper> oh
16:30:24 <markusry> doesn't it?
16:30:32 <tcpepper> it should
16:30:36 <tcpepper> the flow should be:
16:30:38 <tcpepper> start launcher
16:30:40 <tcpepper> install deps
16:30:43 <tcpepper> be a launcher
16:31:00 <tcpepper> if it's being a launcher while still installing deps, then it's got a missing wg
16:31:08 <markusry> Agreed.
16:31:14 <markusry> So we just need a way to cancel it
16:31:27 <markusry> If needed.
16:31:49 <markusry> If launcher gets a sigterm for example
16:31:54 <tcpepper> that's where I think it's a feature
16:31:55 <markusry> While osprepare is running
16:32:03 <tcpepper> this shouldn't normally happen
16:32:12 <markusry> But it's not a feature as it has introduced a regression in launcher
16:32:17 <tcpepper> and osprepare must be allowed to run to completion at least once
16:32:25 <markusry> In that launcher can no longer quit cleanly
16:33:33 <tcpepper> I see your point and can accept this as a low or high priority feature or bug
16:33:43 <tcpepper> all depends on what you're worried about / bothered by
16:34:40 <kristenc> mark originally placed the priority as P2, so presumably that reflects how he feels about it. it doesn't impact me.
16:34:48 <markusry> Well, launcher has been carefully designed to always exit cleanly
16:35:06 <markusry> And this breaks all that careful planning.
16:35:30 <tcpepper> I see two scenarios:
16:35:51 <tcpepper> 1) rare: osprepare has something to do and spends even minutes doing it and it must be done for launcher to run
16:35:51 <mcastelino> kristenc: I agree with mark, i think this is important...  as I have done this Ctrl-C especially when I have gotten stuck due to some other issue
16:36:04 <tcpepper> 2) frequent: osprepare has nothing to do and spends less than a second doing it
16:36:15 <kristenc> ok - so we have 2 affected components. mcastelino is this a P2 for you, or a P1?
16:36:20 <tcpepper> in scenario #2 you want to be able to kill it
16:36:31 <tcpepper> is it an urgent requirement to kill it in #1
16:37:09 <markusry> What happens if the proxies aren't set up and apt-get hangs
16:37:11 <tcpepper> and how often in scenario #2 (the common case) will you manage to kill it in that tiny time window, and is waiting the rest of the less than a second that noticeable?
16:37:40 <tcpepper> there shouldn't be an apt-get unless dpkg shows the desired things aren't present
16:37:45 <markusry> I'd want to ctrl-c then.
16:37:55 <mcastelino> P2 is we have some sort of timeout and message saying do not kill now
16:38:06 <tcpepper> apt-get should only happen when you're missing dependencies required to run
16:38:24 <tcpepper> so I feel like you're saying launcher needs to run cleanly when you don't want to run launcher
16:38:26 <markusry> But this is a use case I can see happening
16:38:47 <tcpepper> either way we've probably discussed it more than it would take to implement something sufficient to fix it
16:38:59 <markusry> Either when running launcher on a new machine or after launcher's deps have been updated
16:39:33 <markusry> I'm saying that launcher should always quit gracefully
16:39:40 <markusry> Regardless of what it's doing
16:40:01 <markusry> This is the way it has been designed and osprepare is violating this design currently.
16:40:07 <kristenc> markusry, mcastelino : so shall we leave this as P2 bug then? it does mean it will be deprioritized behind P1 features.
16:40:24 <markusry> P2 is fine.  I might even fix it myself
16:41:05 <kristenc> ok.
16:41:07 <kristenc> 504 then.
16:41:34 <tcpepper> wdouglas: can we assign this to you?
16:41:55 <markusry> Right now we can't assign him things.
16:41:58 <markusry> I'm trying to fix this
16:42:02 <tcpepper> ah that's not done yet
16:42:03 <tcpepper> ok
16:42:05 <kristenc> I agree with the P2 priority on this, and that it's a bug.
16:42:35 <kristenc> sounds like william is actively working on already.
16:42:38 <tcpepper> the only thing I'd note is I think this is not an osprepare issue, but rather a ciao issue
16:42:43 <tcpepper> ciao needs a logging abstraction
16:42:58 <kristenc> tcpepper, ok, do you want to edit the bug?
16:43:06 <markusry> But it's highest priority to fix in osprepare first
16:43:12 <markusry> As right now it does fprint
16:43:13 <markusry> f
16:43:26 <markusry> So there are essentially no logs
16:43:48 <kristenc> yes - this issue is specifically for osprepare now that I read it again.
16:44:19 <kristenc> do we need to file a new issue then for the "all of ciao" issue?
16:44:32 <markusry> Yes, I think this would be better
16:44:42 <tcpepper> yes b/c ciao has printf's all over the place too
16:44:52 <markusry> That's not good.
16:45:20 <kristenc> #action tcpepper to capture ciao logging issues in a github issue
16:45:20 <tcpepper> $ git grep Println | grep -v _test | wc -l
16:45:20 <tcpepper> 68
16:45:23 <markusry> But at least the other components have some logs. The problem with osprepare is that when I run launcher I don't see any evidence that it's doing anything
16:45:50 <tcpepper> $ git grep Printf | grep -v -e _test -e vendor | wc -l
16:45:50 <tcpepper> 199
16:45:50 <tcpepper> $ git grep Println | grep -v -e _test -e vendor | wc -l
16:45:50 <tcpepper> 33
16:46:05 <tcpepper> yes osprepare should be clear what it is doing
16:46:08 <markusry> Some of them are okay though
16:46:09 <albertom> and if osprepare fails to install the packages due to proxies, launcher keeps running but there is no indication of the failure
16:46:10 <kristenc> 508 - this has no priority.
16:46:23 <markusry> As we have command line tools like ciaolc
16:46:29 <markusry> Which need to print to stdout
16:46:33 <markusry> and testcases as well
16:46:59 <kristenc> I'd give 508 a P2 personally.
16:47:23 <kristenc> but I'm not sure the impact.
16:47:35 <kristenc> would it just install a bunch of extra stuff it didn't need?
16:47:45 <tcpepper> P1...you can't run docker instances
16:47:46 <kristenc> or would it fail to work at all on Ubuntu?
16:47:49 <albertom> well, compute nodes will not be able to launch container because docker is missing
16:48:05 <kristenc> ok, P1.
16:48:16 <tcpepper> and it's a launcher bug technically
16:48:21 <kristenc> updated.
16:48:22 <tcpepper> osprepare does what a component asks for
16:48:39 <tcpepper> b/c I'm dumb, launcher asks for some KDE thing instead of the containers docker
16:48:44 <tcpepper> osprepare complies
16:48:46 * wdouglas reads up the log
16:48:53 <albertom> ok i will update the commit message
16:49:02 <kristenc> looks like albertom is working on this, correct?
16:49:09 <albertom> fix is 3 chars
16:49:11 <markusry> But docker has different names on different distros doesn't it
16:49:12 <albertom> and ready
16:49:13 <mrkz> yes, albertom already sent PR for that
16:49:17 <kristenc> albertom, thanks.
16:49:32 <kristenc> #510
16:50:04 <markusry> Okay, I agree.  The change is needed in launcher
16:51:42 <tcpepper> 510 I would say P3 or maybe P2 just b/c we're not actually usable and not quite going to be there on a P1 sort of schedule
16:52:03 <tcpepper> some near term sprint needs to audit our overall usability
16:52:06 <kristenc> 510 I made P2 because it was existing behavior. but it's got to be fixed before we can have actual users.
16:52:13 <kristenc> the fix isn't easy.
16:52:22 <tcpepper> yeah...this is going to be part of a broader sprint/epic
16:53:04 <kristenc> it isn't super hard, just need to modify datastore to include tenant info (or "public") and start checking everywhere to make sure you have permission to use that workload.
16:53:10 <kristenc> it's just not trivial.
16:53:52 <tcpepper> but it also implies there's some way to create these things besides having a user tenant shell on the machine controller runs on
16:54:09 <tcpepper> or today a root shell
16:54:31 <tcpepper> ie: every tenant today has to share a root shell on the controller
16:54:56 <tcpepper> resolving that aspect of "workloads are all public" is more than datastore
16:55:43 <kristenc> not sure what you mean by that - but my feeling is that we fix this as part of the workload validator feature.
16:55:55 <kristenc> so we have 5 minutes left.
16:56:06 <kristenc> 511 - this seems to be not a ciao problem.
16:56:13 <albertom> kristenc: well it might be
16:56:23 <kristenc> albertom, how so?
16:56:25 <albertom> the cmd call is not honoring environment variables
16:56:31 <kristenc> ah.
16:56:36 <kristenc> that is definitely a bug.
16:56:37 <albertom> if we make that call honor /etc/profile
16:56:42 <albertom> then it will work
16:56:46 <mcastelino> kristenc: I was late.. did we talk about the single VM setup from obed... it is almost ready to do with some minor changes... I tested it and it will make the test 0 config
16:57:08 <kristenc> mcastelino, no - we got started late and we're still scrubbing bugs.
16:57:39 <kristenc> albertom, I think P2 for this since only one distro is impacted.
16:57:45 <kristenc> although sounds like the fix is easy.
16:57:56 <albertom> agree
16:58:03 <kristenc> albertom, can you update the issue with your info about the env problems?
16:58:12 * albertom doing so
16:58:40 <kristenc> albertom, are you going to fix this, or should we assign it to tcpepper ?
16:59:29 <markusry> mcastelino: What do you mean will make the test 0 config?
16:59:45 <mcastelino> exactly that.. no configuration anymore for single VM
17:00:05 <tcpepper> it should go to ikey
17:00:09 <mcastelino> you can run it on your bare metal system with no configuration and it will just work (thanks to obedmr)
17:00:15 <tcpepper> or albertom
17:00:35 <kristenc> tcpepper, I can't assign it to ikey.
17:00:44 <kristenc> I can assign it to albertom though.
17:00:56 <albertom> i might not be able to work on it until tomorrow
17:01:07 <kristenc> albertom, no problem.
17:01:18 <markusry> Need to go.
17:01:45 <kristenc> ok, unfortunately we are out of time.
17:01:45 <mcastelino> markusry: our single vm wiki page will be just cd $GOPATH/src/github.com/01org/ciao/testutil/singlevm && ./setup.sh