Compiler options are one of those subjects that get decidedly more complicated as you descend the rabbit hole. Developers using or creating C/C++/assembly libraries on Android undoubtedly want to compile the most optimal binary for as many devices as they can. I’ll try to give you a simple answer, but this thread should serve as a dialog: as NDK versions, gcc versions and even compiler types change with time, a diversity of information, evidence and opinions is welcome.
At the time of this writing, NDKr9b is the most current set of toolchains. It contains 2 prebuilt ARM gcc toolchains with varying degrees of patches from different contributing bodies (e.g. Linaro) and processor groups. You may have also noticed LLVM/Clang in the “toolchains” directory. More on that in a future post...
At the very least, I suggest beginning with:
Compiler version gcc 4.8.2: export NDK_TOOLCHAIN_VERSION=4.8 or --toolchain=arm-linux-androideabi-4.8
Cortex A9 optimizations: -mcpu=cortex-a9
Soft Floating Point: -mfloat-abi=softfp (though, NDKr9b supports hard float which may benefit you considerably)
NEON: -mfpu=neon-fp16 for the compiler itself; -DHAVE_NEON=1 is only a preprocessor define (the hello-neon sample uses it to switch on its NEON code path)
These are easily set on a per-project basis. For example, check out the NDK sample project “hello-neon”.
You can add the above flags (-mcpu=cortex-a9 -mfloat-abi=softfp -DHAVE_NEON=1) to LOCAL_CFLAGS in the file: android-ndk-r9b/samples/hello-neon/jni/Android.mk
In android-ndk-r9b/samples/hello-neon/jni/Application.mk append:
NDK_TOOLCHAIN_VERSION := 4.8
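To make the role of -DHAVE_NEON=1 concrete, here is a minimal sketch of how a define like that can gate a NEON code path in C. The function name add_floats is hypothetical and not taken from the sample. The sketch also assumes the compiler predefines __ARM_NEON__ when NEON code generation is enabled (as GCC does with an -mfpu=neon variant), so the same file still builds and runs without NEON.

```c
#include <stddef.h>

/* HAVE_NEON comes from the build (-DHAVE_NEON=1); __ARM_NEON__ is assumed
   to be predefined by the compiler when NEON code generation is enabled. */
#if defined(HAVE_NEON) && defined(__ARM_NEON__)
#include <arm_neon.h>
#define USE_NEON_PATH 1
#else
#define USE_NEON_PATH 0
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats. */
static void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if USE_NEON_PATH
    /* Process four floats per iteration with NEON intrinsics. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar loop: handles the tail, or everything when NEON is off. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

Built without the flags, the function falls back to the scalar loop, so the same source can be shared between NEON and non-NEON builds.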
For more information on floating point and NEON (VFPv4, VFPv3, hard float, soft float, etc.) please check out Richard Earnshaw’s blog, ARM Cortex-A Processors and GCC Command Lines.
This is where you all tell me I'm wrong because of the A15 pipeline, memcpy, div optimizations and the hard-float direction the community is moving in. True, but for the time being most of the Android mobile devices in the world are Cortex A9 (or newer), and you’ll want to strike a balance between users of those predominant (now inexpensive) devices, who still want a good experience with your game, media or app, and those willing to pay a premium for a high-performance device like an 8-core Cortex A7/A15 SoC. This is going to get even more interesting as ARMv8 SoCs come online, but the good folks behind the NDK at Google have already sorted out major toolchain differences with the APP_ABI options in Application.mk, and I’ll post about both load-time and runtime binary selection methods in the future if anyone is interested.
Complicating factors will always exist. For example, in older gcc versions (4.6), flags for Cortex A9 are actually harmful because they optimize for A8 pipelines. There is the aforementioned, ongoing soft-to-hard-float transition, and there were build changes even in recent Android projects. There are also still a few non-NEON v7 SoCs out there, though none are still in production.
More upsides? The community is generally headed towards support of the ARM® C Language Extensions (ACLE). GCC support is still in progress, but you should expect GCC to define __ARM_ARCH (the value will be 7 for Cortex A15) and __ARM_ARCH_PROFILE (the value will be 'A', or decimal 65 if you prefer, on Cortex A15). You will also see that __ARM_FEATURE_FMA is defined. Division isn't covered by ACLE yet, so you will still need to check the GCC extension __ARM_ARCH_EXT_IDIV__ (the ACLE macros won't have trailing underscores).
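As a rough illustration of those feature-test macros, the sketch below wraps each one in a small helper so the rest of your code can query the compile-time target. It assumes the spellings __ARM_ARCH, __ARM_ARCH_PROFILE and __ARM_FEATURE_FMA from ACLE and the GCC-specific __ARM_ARCH_EXT_IDIV__. The helper names are hypothetical, and every check falls back to "absent" when a macro is missing (a non-ARM host, or a compiler that does not define it yet).

```c
#include <stdio.h>

/* Architecture level the compiler targets (e.g. 7 for ARMv7-A);
   0 means the macro is not defined. */
static int arm_arch_level(void)
{
#if defined(__ARM_ARCH)
    return __ARM_ARCH;
#else
    return 0;
#endif
}

/* Architecture profile as a character ('A', 'R' or 'M'); '?' if undefined. */
static char arm_arch_profile(void)
{
#if defined(__ARM_ARCH_PROFILE)
    return (char)__ARM_ARCH_PROFILE;
#else
    return '?';
#endif
}

/* 1 when the compiler reports fused multiply-add support, else 0. */
static int arm_has_fma(void)
{
#if defined(__ARM_FEATURE_FMA)
    return 1;
#else
    return 0;
#endif
}

/* 1 when GCC reports hardware integer divide, else 0. */
static int arm_has_hw_divide(void)
{
#if defined(__ARM_ARCH_EXT_IDIV__)
    return 1;
#else
    return 0;
#endif
}

/* Writes a one-line summary into buf; returns snprintf's result. */
static int arm_target_summary(char *buf, size_t len)
{
    return snprintf(buf, len, "arch=%d profile=%c fma=%d idiv=%d",
                    arm_arch_level(), arm_arch_profile(),
                    arm_has_fma(), arm_has_hw_divide());
}
```

Calling arm_target_summary from a small test program and printing the buffer is a quick way to confirm what a given combination of -mcpu and -mfpu flags actually enabled. Note that these are compile-time answers about the target you built for, not runtime answers about the device the binary lands on.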
If you’re really interested in tuning performance for individual SoCs, the options are there. I look forward to discussion on how to cast the widest net in terms of performance on ARM SoCs.