[OmniOS-discuss] LX: real ksh93 broken

Dan McDonald danmcd at omniti.com
Wed May 10 20:32:03 UTC 2017


Again, thank you for the digging.  I'd be interested in why ksh's getconf() fails as well.

Thanks,
Dan


> On May 10, 2017, at 3:44 PM, Ludovic Orban <lorban at bitronix.be> wrote:
> 
> Okay, I found what causes ksh to misbehave. It's in sh_init(), when shgd->lim.child_max is initialized with the results of getconf("CHILD_MAX"), see: https://github.com/att/ast/blob/master/src/cmd/ksh93/sh/init.c#L1289
> 
> I've commented out that line, hardcoded shgd->lim.child_max to 128, rebuilt and voila: ksh works as it should.
> 
> Now I have to dig into that getconf() function to figure out what value it returns and where that value comes from. Sounds trivial, but my C is *very* rusty, the asm gcc generates doesn't look at all like what the JVM's JIT generates (which gives me wrong reflexes as I'm used to the latter) and I'm not very familiar with mdb.
> 
> Oh well, that turned into a nice debugging re-training session which I very much needed. That reminds me of the good old days at my first job when I was porting Linux apps to Solaris.
> 
> Thank you for maintaining such a well-designed and pleasant to use OS!
> 
> 
>> On Wed, May 10, 2017 at 3:59 PM, Dan McDonald <danmcd at omniti.com> wrote:
>> Wow, thank you for the further deep-diving.
>> 
>> > On May 10, 2017, at 5:21 AM, Ludovic Orban <lorban at bitronix.be> wrote:
>> >
>> > Looking at ksh's sources, my understanding is that job_post is stuck in that else clause:
>> >        else
>> >        {
>> >               /* create a new job */
>> >               while((pw->p_job = job_alloc()) < 0)
>> >                      job_wait((pid_t)1);
>> >               pw->p_nxtjob = job.pwlist;
>> >               pw->p_nxtproc = 0;
>> >        }
>> >
>> > Digging into the sources and stepping through the instructions of job_alloc and job_byjid, it looks like ksh cannot allocate a job id as it believes they're all reserved. But so far, all this code works purely on ksh's internal structures, so an LX bug would have no impact.
>> >
>> > I'll continue looking into this as time permits and I'll post an update if I find anything worth mentioning.
>> >
>> 
>> Be careful of narrowing your focus too far.  I see some things worth considering:
>> 
>> 1.) Is the "if" you're not showing me dependent on something in global state that may have been mis-initialized by an LX emulation bug?
>> 
>> 2.) Same question as #1, but applied to job_alloc() and job_wait().
>> 
>> I'm guessing LX in OmniOS is failing because I mismerged or plain forgot something, given that Nahum says he can run ksh93 on SmartOS just fine.
>> 
>> 
>> Please make sure you're looking at the bigger picture, but THANK YOU for the further investigation.
>> 
>> Dan
>> 
> 