(ns org.danlarkin.blog)

“phrase from nearest book” meme

  • Grab the nearest book.
  • Open it to page 56.
  • Find the fifth sentence.
  • Post the text of the sentence in your journal along with these instructions.
  • Don’t dig for your favorite book, the cool book, or the intellectual one: pick the CLOSEST.

Here’s mine:

“401 responses should include a WWW-Authenticate header field indicating the name of the protected realm.”

apply _is_ lazy (and I was wrong)

So, as Chouser and Rich said, apply is in fact lazy. spread does not walk the whole list; it walks just enough to satisfy the function passed to apply.

So one can (apply str (range 10e6)), and the only limit will be how big a string can fit in one’s heap.

In fact,

user=> (time (first (apply #(lazy-cat %&) (range 10e200))))
"Elapsed time: 0.351 msecs"
0

definitely shows that apply doesn’t evaluate the whole list.
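Here’s another small sketch of the same point, one that counts how many elements actually get realized. The names realized, counted-naturals and sum-first-two are made up for this illustration, and it assumes a reasonably recent clojure where iterate and map are not chunked:

```clojure
;; Counts how many elements of the lazy seq get realized.
(def realized (atom 0))

(defn counted-naturals
  "Infinite lazy seq 0, 1, 2, ... that bumps `realized` once per element."
  []
  (map (fn [n] (swap! realized inc) n) (iterate inc 0)))

;; A variadic fn that only looks at its first two arguments.
(defn sum-first-two [a b & _more]
  (+ a b))

;; apply hands the infinite seq to the fn without walking all of it.
(def result (apply sum-first-two (counted-naturals)))

(println result)     ; 0 + 1
(println @realized)  ; only a handful of elements were ever realized
```

If apply were eager this would never return, since the seq is infinite.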

Sorry guys :)

apply is not lazy

So I’ve been trying to figure out why clojure can’t deal with applying a function to a huge list. Here’s an example of the problem:

user=> (apply str (range 10e6))
java.lang.OutOfMemoryError: Java heap space (NO_SOURCE_FILE:0)

And here’s the source for apply straight from boot.clj in the clojure distribution (as of r1081):

(defn spread
  {:private true}
  [arglist]
  (cond
   (nil? arglist) nil
   (nil? (rest arglist)) (seq (first arglist))
   :else (cons (first arglist) (spread (rest arglist)))))
 
(defn apply
  "Applies fn f to the argument list formed by prepending args to argseq."
  {:arglists '([f args* argseq])}
  [#^clojure.lang.IFn f & args]
  (. f (applyTo (spread args))))

I included the source for spread too because it’s called from apply and it’s the source of the problem. spread walks the entire list it’s passed and returns a series of cons cells (a list). So using apply forces an eager (non-lazy) evaluation of the list — which is absolutely not what I want with such a large list.

So I need a lazy-apply… I wonder if that’s even possible.

UPDATE: I was wrong, apply is lazy. Thanks for the comments Chouser and Rich.
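To see why the update holds, it helps to look at what spread conses up. The sketch below is a copy of spread under the made-up name my-spread, adapted on the assumption that it runs on a newer clojure where rest returns an empty seq instead of nil (hence next). Only the leading arguments get cons cells; the final seq is attached as-is and never walked:

```clojure
;; Copy of spread for illustration; `my-spread` is a hypothetical name.
;; Uses next instead of rest so the nil checks work on newer clojure.
(defn my-spread [arglist]
  (cond
    (nil? arglist) nil
    ;; last argument: return it as a seq, without walking it
    (nil? (next arglist)) (seq (first arglist))
    ;; leading arguments: cons each one onto the front
    :else (cons (first arglist) (my-spread (next arglist)))))

;; The last argument is infinite, yet this returns immediately.
(println (take 5 (my-spread [1 2 (range)])))  ; (1 2 0 1 2)
```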

clojure-json memory usage

Today I decided to do a little bit of microbenchmarking on my clojure JSON encoder.

Once again I import the encoder with

user=> (ns foo (:require (org.danlarkin [json :as json])))
nil
foo=>

And I run a simple benchmark on a list of 10e5 items:

foo=> (dorun [(time (json/encode (range 10e5)))])
"Elapsed time: 5535.336 msecs"
nil

That run is about the average running time on my machine. So it’s about 5.5 seconds for a list of 10e5 items, not bad I guess. But what happens when I try (range 10e6)?

foo=> (dorun [(time (json/encode (range 10e6)))])
java.lang.OutOfMemoryError: Java heap space (NO_SOURCE_FILE:0)

Uh oh! So why is it happening? My first guess was that my algorithm was keeping too much garbage as it iterates through the list, but it turns out I can’t even create a list of 10e6 items in the first place:

foo=> (apply list (range 10e6))
java.lang.OutOfMemoryError: Java heap space (NO_SOURCE_FILE:0)

I can get the list of (range 10e6) to evaluate if I run the jvm with a max heap size of 1GiB (pass -Xmx1024m to java) but encoding it to JSON still blows the heap.
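When experimenting with -Xmx it can be handy to check from the REPL which limit the JVM actually picked up. A minimal sketch, where max-heap-mib is a made-up helper name; the reported value is usually close to the -Xmx setting but may not match it exactly:

```clojure
;; Hypothetical helper: the JVM's maximum heap size in MiB,
;; as reported by java.lang.Runtime.
(defn max-heap-mib []
  (quot (.maxMemory (Runtime/getRuntime)) (* 1024 1024)))

(println (max-heap-mib))
```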

This would be a great time to hook up a debugger like JSwat and see what’s going on, except that I can’t get it to run on my machine (I get a NullPointerException when I start it up; I’ve got an email out asking for help).

I took a guess that my excessive memory usage was coming from passing java.lang.String objects around everywhere, so I started a new branch that uses java.io.Writer objects, which I believed to be more memory-efficient. It didn’t help this situation, though, since even with the writer branch I’m overflowing a 1GiB heap.
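To make the String-versus-Writer idea concrete, here is a rough sketch of writing a seq of integers to a java.io.Writer one element at a time, rather than building intermediate strings for the whole collection. This is not the code from the writer branch; write-int-array is a hypothetical helper that only handles integers:

```clojure
(import '(java.io StringWriter Writer))

(defn write-int-array
  "Hypothetical helper: writes a seq of integers to w as a JSON array,
  one element at a time, and returns w."
  [^Writer w coll]
  (.write w "[")
  (loop [xs (seq coll) first? true]
    (when xs
      (when-not first? (.write w ","))   ; comma before every element but the first
      (.write w (str (first xs)))
      (recur (next xs) false)))
  (.write w "]")
  w)

(println (str (write-int-array (StringWriter.) (range 5))))  ; [0,1,2,3,4]
```

With a StringWriter the full output still ends up in memory, so this only saves on intermediate garbage; a FileWriter or similar would avoid holding the result at all.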

Hopefully in the near future I can get JSwat working and figure out what’s taking up so much RAM.

First clojure project

Today I wrote and uploaded to github my very first clojure project. It’s a JSON encoder. It takes an arbitrarily nested clojure data structure and returns a JSON-encoded string.

As the README says, installing is as simple as adding the src directory to your classpath and importing it into your namespace with something like

(ns foo (:require (org.danlarkin [json :as json])))

Using it is pretty simple too:

foo=> (json/encode [1,2,3,4,5])
"[1,2,3,4,5]"
foo=> (json/encode {"a" 1 "b" 2 "c" 3})
"{\"a\":1,\"b\":2,\"c\":3}"
 

It took me a few hours to code this up, which I’m pretty happy with, considering I haven’t done that much in clojure yet and I had to consult json.org quite a bit.

Next step: a JSON-decoder.

© Dan Larkin. This work is licensed under a Creative Commons Attribution 3.0 United States License.